THE STRUCTURE OF INTELLECT


J. P. GUILFORD 


University of Southern California 


It is the purpose of this report to 
describe a developing picture of the 
structure of human, adult intellect, 
as seen in terms of factors. Although 
the picture is incomplete, presenting 
it at this time seems desirable for two 
reasons. The picture now includes 
about forty different factors, most of 
which are generally unfamiliar. Many 
have only recently been demon- 
strated. Enough of the intellectual 
factors are known to suggest strongly 
the outlines of a system. The system
has interesting theoretical implica- 
tions, and, by reason of certain vacan- 
cies that appear, it points to still un- 
discovered factors, somewhat as the 
chemist’s periodic table has served 
to indicate unknown elements. 

As the writer has emphasized be-
fore (10, 13), psychology and psychol-
ogists since Binet have taken a much
too restricted view of human intelli- 
gence. We do not need to go into the 
reasons here. They can be summed 
up in a positive manner by saying 
that in attempting to fathom the na-
ture of intellect, more attention
should be given to the human adult, 
particularly the superior human adult. 
It is to such specimens that we must 
go, if we are to investigate intellectual 
qualities and functions in their great-
est scope and variety.

The advent of multiple-factor an-
alysis has done something to broaden
and enrich our conception of human 
intelligence, but factor theory and 


the results of factor analysis have 
had little effect upon the practices of 
measurement of intelligence. We do 
have a great variety of tests in such 
intelligence scales as the Binet and 
its revisions and in the Wechsler 
scales, to be sure. Too commonly, 
however, a single score is the only 
information utilized, and this single 
score is usually dominated by vari- 
ance in only one or two factors. 
There is some indication of more gen- 
eral use of part scores, as in connec- 
tion with the Wechsler tests, but each 
of these scores is usually factorially
complex and its psychological mean-
ing is largely unknown as well as am-
biguous. The list of factors that is
to be presented in this article should
clearly demonstrate the very limited
information that a single score can
give concerning an individual, and
on the other hand, the rich possi-
bilities that those factors offer for
more complete and more meaningful
assessments of the intellects of per-
sons.


Some seven years ago the writer
initiated research aimed essentially
at the study of adult, human
intelligence, in a project on “aptitudes
of high-level personnel.”1 In some
respects this has been a continuation
of wartime research in the AAF
Aviation Psychology Research Program
(21). The project was initiated with
the conviction that the full scope of
human intellect had not yet been
explored, by factor-analysis methods
or by any other methods. Thinking
abilities, which have played important
roles in some definitions of
intelligence, seemed to have been
neglected; particularly abilities having
to do with productive thinking.
Accordingly, four areas of thinking were
selected for study, arbitrarily
designated as reasoning, creativity,
planning, and evaluation. While abilities
belong to the context of individual
differences, they also imply
psychological functions of individuals.
Hence it was thought that the findings
would have much to offer toward
an understanding of human thinking
of various kinds, including problem
solving.

1 Project 150-044, under Contract N6onr-
23810, with the Office of Naval Research,
monitored by the Personnel and Training
Branch. Among those who have made the
most significant contributions to the project
are: Raymond M. Berger, Paul R. Christensen,
Andrew L. Comrey, Russel F. Green,
Alfred F. Hertzka, Norman W. Kettner, and
Robert C. Wilson. I am particularly indebted
to Christensen and Kettner for reading the
preliminary draft of this paper, and to Philip
R. Merrifield, also, for making suggestions.

Space does not permit describing
in detail the research procedures;
they have been described in the
various technical reports from the
aptitudes project (14, 15). It should be
pointed out, however, that the factor
analyses were done in a research
design that includes experimental
features. Each investigation starts by
hypothesizing that certain unitary
abilities (psychological factors) exist
and that they have certain properties.
Psychological tests are then
selected, adapted, and constructed
for each hypothesized factor in a
way that should lead to a “yes” or
“no” answer from the analysis. The
results should show that the factor
hypothesized does or does not exist
and it does or does not have the prop- 
erties suggested. Thus, the kind of 
psychological test is an important 
independent variable, more or less 
under the control of the investigator. 
Certain other experimental variables 
are held relatively constant—the 
testing conditions and certain popu- 
lation features, such as sex, age, edu- 
cation, and motivation. The exam- 
inees have been men who were pre- 
viously selected for military training 
leading to an officer’s commission
and they have been tested under 
ordinary military discipline. 

In his survey of aptitude factors, 
published in 1951, French (8) listed, 
among others, 18 or 19 factors that 
can be classified as intellectual. Our 
investigations of thinking abilities 
have verified and helped to clarify 
many of these factors, besides intro- 
ducing approximately as many new 
ones. Other recent investigations 
have also contributed new informa- 
tion regarding factors. The list pre- 
sented here comes from all these 
sources.


CLASSES OF INTELLECTUAL FACTORS 


Inspection of the total list shows 
that the intellectual factors fall into 
two major groups—thinking and 
memory factors. The great majority 
of them can be regarded as thinking 
factors. Within this group, a three- 
fold division appears—cognition (dis- 
covery) factors, production factors, 
and evaluation factors. The production
group can be significantly
subdivided into a class of
convergent-thinking abilities and a class of
divergent-thinking abilities.2

2 In the system of the intellectual factors to
be described here, the reader will find some
striking similarities to a system developed
independently by Burt (2). The similarities are
support for the idea that a system does exist.


Cognition (Discovery) Factors


The cognition factors have to do
with becoming aware of mental items
or constructs of one kind or another.
In the tests of these factors,
something must be comprehended,
recognized, or discovered by the examinee.
They represent functions on the
receiving side of behavior sequences.
The cognition abilities can be
differentiated along the lines of two
major principles. For some time we
have been aware that thinking factors
tend to pair off according to the
material or content used in the tests.
For each factor of a certain kind
found in verbal tests there seemed to
be a mate found in tests composed of
figures or designs. We found, for
example, a factor called eduction of
perceptual relations, parallel with a
factor called eduction of conceptual
relations; a factor called perceptual
foresight, parallel to one called
conceptual foresight; and a factor of
perceptual classification, parallel with
one of conceptual classification.

Only recently has there been
increasing evidence for a third content
category. Factors were found in tests
whose contents are letters, or
equivalent symbols, where neither
perceived form or figures nor verbal
meaning is the basis of operation.
Factors based upon this type of
material have been found, parallel to
other factors where the test content
is figural or verbal. Thus a third
content category seems necessary.

A second major principle by which
cognition factors may be
differentiated psychologically depends upon
the kind of thing discovered: whether
it is a relation, a class, or a pattern,
and so on. Thus, for each combination
of content and thing discovered,
we have a potential factor. The
cognition factors can therefore be
arranged in a matrix as shown in Table
1. The third and fourth rows seem
to be complete at the present time.
There are vacancies in the other four
rows. With each factor name are usu- 
ally given two representative tests by 
name to help give the factor opera-
tional meaning.3 A word or two will
be said in addition regarding the less 
familiar tests.4 
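
The matrix arrangement described above lends itself to a small data structure. The following Python sketch is illustrative only: the cell entries are transcribed from Table 1, while the names `CONTENTS`, `COGNITION_FACTORS`, `vacancies`, and `complete_rows` are hypothetical and form no part of the original report.

```python
# Columns of the matrix: type of test content.
CONTENTS = ("figural", "structural", "conceptual")

# Rows: kind of thing known or discovered. Entries follow Table 1;
# a missing key stands for a vacancy (no factor known as yet).
COGNITION_FACTORS = {
    "fundaments": {
        "figural": ["figural closure"],
        "conceptual": ["verbal comprehension"],
    },
    "classes": {
        "figural": ["perceptual classification"],
        "conceptual": ["verbal classification"],
    },
    "relations": {
        "figural": ["eduction of perceptual relations"],
        "structural": ["eduction of structural relations"],
        "conceptual": ["eduction of conceptual relations"],
    },
    "patterns or systems": {
        "figural": ["spatial orientation"],
        "structural": ["eduction of patterns"],
        "conceptual": ["general reasoning"],
    },
    "problems": {
        "conceptual": ["sensitivity to problems"],
    },
    "implications": {
        "figural": ["perceptual foresight"],
        "conceptual": ["conceptual foresight", "penetration"],
    },
}


def vacancies(matrix, contents=CONTENTS):
    """Return (row, column) pairs for which no factor is listed."""
    return [(row, col) for row, cells in matrix.items()
            for col in contents if not cells.get(col)]


def complete_rows(matrix, contents=CONTENTS):
    """Return the rows in which every content column holds a factor."""
    return [row for row, cells in matrix.items()
            if all(cells.get(col) for col in contents)]


if __name__ == "__main__":
    print(complete_rows(COGNITION_FACTORS))
    print(vacancies(COGNITION_FACTORS))
```

Under this encoding the complete rows are the third and fourth (relations, and patterns or systems), and each vacancy marks a cell in which an undiscovered factor might be sought.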

It should not be surprising to find 
the factor of verbal comprehension, 
the best known, and the dominant 
one in verbal-intelligence tests gen- 
erally, in the first row of the cogni- 
tion factors and in the conceptual 
column. The fact that the cognition
factors sometimes come in threes
leads us to look for parallel factors
for the perceptual and structural
columns. One candidate for the
perceptual cell in this row would be
the well-known factor of perceptual
speed. This factor has to do with
discriminations of small differences in
form rather than in awareness of
total figures, hence it does not quite
fill the requirement of parallel
properties with verbal comprehension. A
better factor for this purpose is the
one Thurstone (28) called “speed and
strength of closure,” called figural
closure in Table 1. For this factor,
awareness of perceived objects from
limited cues is the key property. The
limitation of cues is necessary to
make the test sufficiently difficult for
testing purposes.

There is no known factor that 
seems to belong in the second column 
of the first row of Table 1. In gen- 
eralizing the class of three such fac- 
tors, and in differentiation from 
other classes in Table 1, it is clear 
that those in the first row have to do 
with awareness of items, elements, or 
things. To denote this category
Spearman’s term “fundament” has
been adopted.

3 It should not be inferred that these are the
only kinds of tests related to the factor.

4 For more complete descriptions of the tests
see particularly (14, 17, 21).

TABLE 1

COGNITION (DISCOVERY) FACTORS

Type of thing                                      Type of content
known or
discovered            Figural                      Structural                   Conceptual

Fundaments            Figural closure                                           Verbal comprehension
                        Street Gestalt Completion                                 Vocabulary
                        Mutilated Words

Classes               Perceptual classification                                 Verbal classification
                        Figure Classification                                     Word Classification
                        Picture Classification                                    Verbal Classification

Relations             Eduction of perceptual       Eduction of structural       Eduction of conceptual
                        relations                    relations                    relations
                        Figure Analogies             Seeing Trends II             Verbal Analogies
                        Figure Matrix                Correlate Completion II      Word Matrix

Patterns or systems   Spatial orientation          Eduction of patterns         General reasoning
                        Spatial Orientation          Circle Reasoning             Arithmetic Reasoning
                        Flags, Figures, Cards        Letter Triangle              Ship Destination

Problems                                                                        Sensitivity to problems
                                                                                  Seeing Problems
                                                                                  Seeing Deficiencies

Implications          Perceptual foresight                                      Conceptual foresight
                        Competitive Planning                                      Pertinent Questions
                        Route Planning                                            Alternate Methods

                                                                                Penetration
                                                                                  Social Institutions
                                                                                  Similarities


Two factors involving ability to
recognize classes are known, one in
which the class is formed on the basis
of figural properties and the other on
the basis of meanings. It was
interesting that the Picture Classification
test had more relation to the
perceptual-classification factor than to the
conceptual-classification factor in spite
of the fact that the things to be
classified were common objects, the basis
for whose classification was intended
to be their meanings. This might
mean that the perceptual-conceptual
distinction is a somewhat superficial
matter, pertaining only to how the
material is presented. It is possible,
however, that in many of the items
in this test the general shapes and
sizes and other figural properties are
an aid in classification. For example,
there are cleaning implements,
containers, etc., in some items, where
similarities of appearance may serve
as clues.

The difference between the Word
Classification test and the Verbal
Classification test is largely in the
form of presentation of the problems.
A sample item from the Word
Classification test is: “A. horse B. cow C.
man D. flower.” Which word does
not belong? In the Verbal
Classification test, two short lists of words are
given to establish two classes, e.g.,
animals and pieces of furniture. A 
longer list of words is given, each 
one of which must be marked as be- 
longing to one class or the other or 
to neither class. 

Is there likely to be a factor having
to do with the seeing of classes when
class membership depends upon
structural properties? Such a factor would
be reasonable. We have much to
learn regarding the scope of
structural ideas. Thus far, structural
factors have been found only in tests
utilizing letters and very simple forms
such as circles, dashes, and the like.
One can raise the question whether
mechanical conceptions, for example,
belong in this class. There is also the
question of where figural properties
end and structural properties begin,
also of where structural properties
end and conceptual properties begin.
We may actually have a continuum
here. With respect to some categories
(including classes, fundaments,
etc.) there may be a rapid transition
from figural to conceptual, thus
leaving no basis for a third factor. It is
likely that the factors in any row of
Table 1 are positively and sometimes
even substantially correlated. The
general question of correlations
among factors will be left for later
discussion.

We have a complete triad of
factors having to do with the seeing of
relationships and tests to measure
them that are similar except for
content. The analogies tests are well
known. A matrix test is essentially a
two-dimensional analogies test,
examples of which may be found in the
Raven Progressive Matrices series.
In the test Seeing Trends II, we find
the following type of item: “anger
bacteria camel dead excite.” The
examinee is to name the letter trend,
which, in this item, of course, is that
the initial letters are in alphabetical
order from “a” to “e.” In the
Correlate Completion II test, an
illustrative item reads: “am ma not ton
tool ____.” What word should come
next? Here it is not word meaning
that is important but letter
sequences. In the Seeing Trends II
test, likewise, the word meanings are
of no significance. Presumably, an
analogies test utilizing letters only
would do as well as a measure of this
factor.
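
The two sample items just quoted turn on letter structure alone, and this can be made concrete in a brief sketch. The Python below is a hedged illustration, not the scoring procedure of either test: the function names are hypothetical, and the rule assumed for the Correlate Completion II item (that the second word of each pair is the first spelled backward) is an inference from the quoted item rather than a statement in the source.

```python
def initial_letter_trend(words):
    """Return the initial letters and whether they run in consecutive
    alphabetical order, as in the Seeing Trends II sample item."""
    initials = [w[0].lower() for w in words]
    steps = [ord(b) - ord(a) for a, b in zip(initials, initials[1:])]
    return initials, all(step == 1 for step in steps)


def reverse_correlate(pairs, probe):
    """If every given pair obeys the (assumed) reversal rule, return the
    probe word spelled backward; otherwise return None."""
    if all(second == first[::-1] for first, second in pairs):
        return probe[::-1]
    return None


if __name__ == "__main__":
    item = "anger bacteria camel dead excite".split()
    print(initial_letter_trend(item))
    print(reverse_correlate([("am", "ma"), ("not", "ton")], "tool"))
```

Neither function consults a dictionary or any word meaning, which is the point made in the text: the items could in principle be built from arbitrary letter strings.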

In the row of Table 1 pertaining to 
patterns or systems, we have three 
factors, but they are much more dis- 
parate in kind than usual in this ta- 
ble. The clearest example of an educ- 
tion-of-patterns factor is in the 
middle column. The Circle Reason- 
ing test, adapted from Blakey (9), is 
similar to the Marks test of Thur- 
stone and to the Spatial Reasoning 
test of the AAF (21). In a sequence 
of symbols the examinee must dis- 
cover the principle by which certain 
symbols are marked, then he must 
mark a new set accordingly. In the 
Letter Triangle test, the letters are 
arranged in a different alphabetical 
pattern in each item. The examinee 
must discover the pattern and show 
this by filling a blank with a letter. 

Under the figural category we find
the factor of spatial orientation, a
well-known space factor. It is best
defined as the ability to become
aware of the spatial order or
arrangement of objects perceived visually.

Until the system of cognition
factors was conceived, the writer had
thought of spatial orientation as a
purely perceptual ability rather than
intellectual.5 Its place in the system
is regarded as tentative. We may
yet find another seeing-patterns
factor in which figural properties play a
more obvious role than they do in the
factor of spatial orientation. In a
real sense, an orientation within a
field of perceived objects is a pattern
or system, where spatial
arrangement, which includes the viewer, is
the principle. Shapes and sizes of
objects, which play a more obvious role
in the case of the other figural
factors, are of more indirect significance
in spatial orientation.

5 A perceptual factor is distinguished from
an intellectual factor by the fact that no
symbolic activity is clearly involved.

Under the conceptual category we 
find a factor that has been most dif- 
ficult to define. The best conception 
of it is that it represents an ability 
to define or structure problems. It 
has been a most consistent compo- 
nent of arithmetic-reasoning tests, but 
since such tests are psychologically 
complex, it has been difficult to de- 
termine just what aspect of solving 
problems of this type is the signifi- 
cant feature that requires the ability 
called general reasoning. By elimina- 
tion of many rival hypotheses, it 
is now rather clear that the factor 
pertains to the comprehension of the 
structure of a problem, at least of the 
arithmetical variety (19). Since such 
a structure is conceptual, the factor 
logically belongs in the column where 
it is placed in Table 1. The Ship
Destination test is a special type of
arithmetical-reasoning test, which
seems to come closer than any other
to being a pure measure of the factor.

In the next row of Table 1, for the
discovery of problems, there is only
one factor, sensitivity to problems,
which is in the conceptual column.
The appearance of this factor parallel 
to general reasoning in the row pre- 
ceding, emphasizes the well-known 
observation that it is one thing to be 
aware that a problem exists and an- 
other thing to be aware of the nature 
of the problem. The titles of the tests 
are quite descriptive. A sample item 
from the test Seeing Problems asks 
the examinee to list as many as five 
problems in connection with a com-
mon object like a candle. The test
Seeing Deficiencies presents in each 
item the general plan for solving a 
given problem, but the plan raises 
some new problems. What are those 
problems? 

Whether we shall ever find parallel
factors for seeing problems or
deficiencies of figural and structural
types remains to be seen. Problems
of a figural type are faced in aesthetic
pursuits such as painting and
architecture. Problems of a structural
type might be faced in connection
with spelling or the development of
language. Tests pertaining to the
seeing of problems have thus far
provided no figural or structural bases
for problems. It should be relatively
easy to test the hypothesis that such
factors exist. If they do exist, their
possible implications for everyday
performance need further study.

In the investigation of planning
abilities (14, 15), two parallel
factors were found, perceptual
foresight and conceptual foresight, where
one was expected. The Competitive
Planning test was originally designed
by the AAF psychologists as a test
of foresight and planning (21). It
requires the examinee to imagine that
he is playing the game of completing
squares by drawing lines. He plays
for the two opponents and in each
item he has to tell the maximum
number of squares each opponent
can complete under the rules of the
game. The Route Planning test,
another AAF product, is a type of maze
problem. The examinee must say
which of alternative points will have
to be passed through in going from
the starting point to the goal. In
both tests, perceived layouts are
used.

The test Pertinent Questions
presents in each item a need for a
decision and the examinee is asked to
state what facts he should consider
in reaching a decision. For example,
a new graduate is offered positions
in two different cities. What should
be the deciding considerations? In
the Alternate Methods test, a
practical problem is given, with available
objects that may be used. The
examinee is to give several alternative
solutions that he considers most
adequate.

Porteus has maintained that his
series of maze tests measure
foresight. He can well claim support
from the factor-analysis results just
mentioned. The type of foresight
measured by maze tests, however, is
of a concrete variety. This ability
may be important for the architect,
the engineer, and the
industrial-layout planner. It may not be found
related to the abstract type of
planning that we find in the political
strategist and the policy maker. So
far as our results go, the maze test
should by no means be offered as a
test of general intelligence. This
statement might need modification,
however, after the maze test is factor
analyzed in a population of lower
general intellectual level (where
general intelligence is defined
operationally as an average of all intellectual
abilities). In a population of
“high-level personnel,” we can say that a
maze test measures most strongly the
factor of perceptual foresight and,
incidentally, to some degree the
factors of visualization and adaptive
flexibility (15).

The appearance of a factor called
penetration in the last column of
Table 1, along with conceptual
foresight, calls for comment. A factor of
penetration was hypothesized in the
first analysis of creative abilities and
was not found (31). An unidentified
factor found there might well have
been penetration. A factor has been
so identified in a more recent analysis
that emphasized creative ability tests
(20). It is strongly loaded on a test
called Social Institutions, which asks
what is wrong with well-known
institutions such as tipping. It was
designed as a test of sensitivity to
problems, and it has consistently had a
loading on that factor. In the first
creativity analysis, two scores were
based upon this test; one being the
total number of low-quality or
obvious defects and the other was the
total number of high-quality or
“penetrating” defects, defects that can
be seen only by the far-sighted
person. As a matter of fact, the two
scores had much to do with
effecting a separation of the
seeing-problems tests into two groups, one of
which might have been identified as
the penetration factor.

It is quite possible that the factor
of penetration and the factor of
conceptual foresight are one and the same.
They came out in two different
analyses that had no crucial tests in
common. It would be a good
hypothesis that they are identical and
a good prediction would be that if the
four tests listed in Table 1 were
analyzed in the same battery they
would define a single factor, not two.

There is the apparent possibility
for the existence of a foresight factor
involving structural arrangements,
but the scope and usefulness of such
a factor would seem to be
questionable.


Production Factors: Convergent Thinking


The second large group of
thinking factors has to do with the
production of some end result. After one
has comprehended the situation, or
the significant aspects of it at the
moment, usually something needs to
be done to it or about it. In the
analogies test, for example, having seen
the relation between the first pair of
elements of an item we must then
find a correlate to complete another
pair. Having understood a problem, 
we must take further steps to solve 
it. 

Like the cognition factors, the 
production factors show some prom- 
ise of falling under the rubrics of 
figural, structural, and conceptual, 
but here the picture is less complete. 
The kinds of things produced are
more numerous than the kinds dis- 
covered. There are no identities of 
things in the two lists, but there are 
a few parallels or relationships. For 
example, corresponding to the com- 
prehension of words, there are factors 
concerned with the production of 
words; corresponding to the discov- 
ery of classes there is the act of nam- 
ing; corresponding to the discovery 
of relations there is the production of 
correlates; and corresponding to the 
discovery of systems there is the pro-
duction of order. But with these
few instances, the connections and
parallels seem to end.

It was announced earlier that the
production factors fall into two
groups: convergent-thinking factors
and divergent-thinking factors.
Such a distinction seems not to have
been emphasized in prior literature
on thinking. In the case of some of
the production factors, the
distinction is not complete, but in most cases
it is striking.

In convergent thinking, there is
usually one conclusion or answer that
is regarded as unique, and thinking
is channeled or controlled in the
direction of that answer. In tests of
the convergent-thinking factors, there
is one keyed answer to each item.
Multiple-choice tests are well adapted
to the measurement of these abilities.
In divergent thinking, on the other
hand, there is much searching or
going off in various directions. This is
most clearly seen when there is no
unique conclusion. For the
measurement of such abilities, completion
tests are almost a necessity. The 
distinction is not so clear in some 
problem-solving tests, in which there 
must be and usually is some diver- 
gent thinking or search as well as ulti- 
mate convergence toward the solu- 
tion. But the processes are logically 
and operationally separable, even in 
such activities. 
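
The operational difference between the two kinds of test can be sketched in a few lines. The Python below is only an illustration under stated assumptions: keyed scoring follows the description above (one keyed answer per item), whereas counting distinct free responses is an assumed stand-in for scoring a completion test and is not a procedure given in this report. All names and the sample responses are hypothetical.

```python
def score_convergent(responses, key):
    """Count the items on which the response matches the single keyed answer."""
    return sum(1 for item, answer in key.items()
               if responses.get(item) == answer)


def score_divergent(responses):
    """Count distinct non-empty free responses (an assumed scoring rule),
    ignoring differences of case and surrounding blanks."""
    distinct = {r.strip().lower() for r in responses if r.strip()}
    return len(distinct)


if __name__ == "__main__":
    # Hypothetical multiple-choice key and one examinee's answers.
    key = {1: "C", 2: "A", 3: "D"}
    answers = {1: "C", 2: "B", 3: "D"}
    print(score_convergent(answers, key))

    # Hypothetical free responses to one completion-type item.
    free = ["drips wax", "Drips wax ", "burns out too soon", ""]
    print(score_divergent(free))
```

The first function presupposes a unique conclusion for each item; the second presupposes none, which is why a multiple-choice format cannot serve for it.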

In Table 2 we have those produc- 
tion factors identified as dealing with 
convergent thinking. There are five 
potential triads of factors, depending 
upon the kind of result produced:
names, correlates, orders, changes, or
unique conclusions. In two cases
structural-type tests have figured in
factors, thus a three-column matrix
has been again adopted.


In the first row are factors having 
to do with the production of names. 
The two factors there are again con- 
trasted in terms of the concrete- 
abstract dichotomy. They differ, 
also, by the fact that the one has to 
do with the naming of particulars 
while the other has to do with the 
naming of classes. French (8) lists a 
factor of naming, which has been 
called object naming here to distin- 
guish it from the factor of abstraction 
naming, which was just recently dis- 
covered. The appearance of a test 
of Color Naming under the rubric 
of “figural” calls for broadening the 
conception of this class to recognize 
color as a figural property. Classes of 
objects distinguished for their struc- 
tural properties are evidently not 
very common. If good examples can 
be found, we may find a third nam- 
ing factor. In the name of the factor 
of abstraction naming, the term “ab- 
straction’’ may prove to be too com- 
prehensive. The two illustrative 
tests mentioned might suggest that 
the ability is restricted to the nam- 
ing of classes. The results show that 
it is actually broader than that, since 
it pertains to the naming of relations
also, in other tests.


TABLE 2

PRODUCTION FACTORS: CONVERGENT THINKING

Type of result                                     Type of content
produced              Figural                      Structural                   Conceptual

Names                 Object naming                                             Abstraction naming
                        Form Naming                                               Picture-Group Naming
                        Color Naming                                              Word-Group Naming

Correlates                          Eduction of correlates
                                      Correlate Completion
                                      Figure Analogies Completion

Orders                                                                          Ordering
                                                                                  Picture Arrangement
                                                                                  Sentence Order

Changes               Visualization                                             Redefinition
                        Spatial Visualization                                     Gestalt Transformation
                        Punched Holes                                             Object Synthesis

Unique conclusions    Symbol substitution          Numerical facility           Symbol manipulation
                        Sign Changes                 Numerical Operations         Symbol Manipulation
                        Form Reasoning                                            Sign Changes II


With three factors having to do
with the seeing of relationships, we
might well expect three corresponding
factors concerned with the
eduction of correlates. As a matter of
fact, the project has for some time
anticipated at least two such factors,
perceptual and conceptual, and has
designed tests that were expected to
effect the expected separation. To
this date, only one
eduction-of-correlates factor has been clearly
indicated, and both figural and
structural tests have loadings on it. The
Verbal Analogies Completion test,
which we hoped would help to
distinguish a conceptual-correlates
factor, turned out to be a test of
expressional fluency. Evidently the
eduction-of-correlates aspect of the test
was made so easy that little variance
in this ability, if it is separate, was
manifested. On the other hand,
having educed the correlate, thinking of
the needed word provided the chief
basis for individual differences in
scores, and hence the loading on
expressional fluency. It can be
predicted that with the appropriate
tests, three eduction-of-correlates
factors will become evident. Because of
the difficulty of separating them, it
can be predicted that the
intercorrelations of these three factors will be
found to be substantial.

In the investigation of planning 
abilities it was hypothesized that 
there would be an ability to see or to 
appreciate order or the lack of it, as 
a feature of preparation for planning. 
It was also hypothesized that there 
would be an ability to produce order 
among objects, ideas, or events, in 
the production of a plan. A single 
ordering factor was found. Since the 
three tests designed to measure sensi-
tivity to order had low and insignifi-
cant loadings on the factor, while the
three designed to measure the pro- 
duction of order had significant and 
even substantial loadings, the factor 
seems to belong among the produc- 
tion factors. The Picture Arrange- 
ment test presents a four-part car- 
toon strip in which the parts are out 
of correct temporal order. The ex- 
aminee has to state the best order. 
The Sentence Order test presents in 
each item three sentences, each stat- 
ing an event, the examinee being 
told to rearrange them. 

It remains to be seen whether
ordering in terms of figural and
structural properties will call for
additional ordering factors to help
complete the matrix of Table 2.
Figural ordering may be a significant
aspect of pictorial art. It is not so
easy to see where a structural
ordering would be of consequence.

In the next row of Table 2 we find 
the factor of visualization, which has 
been known for some time, and the 
factor of redefinition, which was found 
originally in the first creativity an- 
alysis (31). The thing produced in 
both instances is some kind of change 
or rearrangement or shift. The Spa- 
tial Visualization test is Part VI of 
the Guilford-Zimmerman Aptitude 
Survey. In each item certain move- 
ments of a pictured alarm clock are 
indicated and the examinee is to 
select the view that would be seen af- 
ter the movements. The Thurstone 
Punched Holes test shows a paper 
being folded and a hole or holes then 
cut out. The examinee is to tell how 
the paper would look after unfolding. 

The redefinition factor involves 
shifts of meaning or use of objects or 
parts of objects. The test Gestalt 
Transformation asks such questions 
as: With which of the following ob- 
jects could one best start a fire: A. 
fountain pen, B. onion, C. pocket
watch, D. light bulb, E. bowling ball?
The keyed answer is C, since the 
crystal can be transformed from a 
face cover to a condensing lens. The 
Object Synthesis test asks such
questions as: Given pliers and a
shoestring, what could you make? A good
answer would be “pendulum” or 
“plumb bob.” In either case the ob- 
jects play new roles in the combina- 
tion. 

The last row of factors in Table 2 
presents an interesting triad. Al- 
though there are one or two questions 
that can be raised about their place- 
ment, to be mentioned later, it is 
quite clear that they all involve rigor- 
ous operations with symbols leading 
to unique conclusions. The factor of 
numerical facility is the very well- 
known ability to operate with num- 
bers, where both speed and accuracy 
are significant. The two new factors, 
symbol substitution and symbol man- 
ipulation, were regarded as one fac- 
tor until recently. In one analysis 
the factor looked like a substitution 
ability and in another analysis it
looked like a manipulation ability.
In a recent analysis (20) the two
were found to be separate.

To distinguish these factors, we
must consider the different kinds of
tests that represent the two. In
Sign Changes, the examinee is told
before each block of items what
interchanges to make in algebraic
signs, e.g., “replace − with ×” and
“replace + with −.” He applies the
new rules to several simple equations
such as “3−6=?” and “6+2=?.”
In the Form Reasoning test,
equations are stated in the form of
combinations of simple geometric forms.
Some definitions are first given,
stating that a combination of two
forms, such as a star and a circle,
can be replaced by another single
form, a square. With these
substitutions of single forms for pairs,
combinations greater than pairs must
be reduced to single symbols, taking
each pair in turn.
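
The reduction step in Form Reasoning can be illustrated with a short
sketch. It is only a sketch under stated assumptions: the form names and
the substitution table are hypothetical (the actual test definitions are
not given here), and pairing strictly from the left is one assumed
reading of “taking each pair in turn.”

```python
# Hypothetical substitution rules: a pair of forms is replaced by one form.
# These definitions are invented for illustration; they are not the test's.
RULES = {
    ("star", "circle"): "square",
    ("square", "triangle"): "star",
    ("square", "square"): "circle",
}


def reduce_forms(forms, rules=RULES):
    """Reduce a sequence of forms to a single form, pairing from the left."""
    if not forms:
        raise ValueError("need at least one form")
    current = forms[0]
    for nxt in forms[1:]:
        pair = (current, nxt)
        if pair not in rules:
            raise KeyError(f"no definition for pair {pair}")
        # Replace the pair by the single form that the definitions assign.
        current = rules[pair]
    return current


# star + circle -> square, then square + triangle -> star
print(reduce_forms(["star", "circle", "triangle"]))
```

Under these invented rules a three-form combination is reduced in two
substitutions, each of which requires the examinee to hold the
definitions in mind and apply them exactly.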

It is difficult to accept fully the
placement of symbol substitution in
the figural column. If all tests loaded
on it were like Form Reasoning,
where the definitions and rigorous
operations are all in terms of figures,
the placement would be quite
reasonable. But certain features of the
Sign Changes test suggest that it is
not figural properties, as such, that
are important. They may serve
merely to identify the symbols. In
the Sign Changes test it is the
operation that the symbol stands for
that is important.
The Sign Changes test was originally
designed as a flexibility test;
the Form Reasoning test was not. In
both, the readiness to switch the
meaning or significance of symbols is
the obvious peculiar feature. Perhaps
the emphasis should be placed
on the word “switch.” It may be
that this factor will eventually be
placed in the family of flexibility
factors, which appears in Table 3.
There is no evidence against the
hypothesis that symbol substitution is
the same as the present factor of
adaptive flexibility, represented
particularly by the Match Problems
test. As a matter of fact, Sign
Changes had a significant loading on
adaptive flexibility in the creativity
analysis (31). Form Reasoning has
never had an opportunity to show
such a loading.


TABLE 3


PRODUCTION FACTORS: DIVERGENT THINKING


Type of result    Type of Content
produced          Figural                  Structural               Conceptual

Words                                      Word fluency             Associational fluency
                                                                      Controlled Associations II
                                                                      Associations III

Ideas                                                               Ideational fluency
                                                                      Plot Titles
                                                                      Consequences

Expressions                                                         Expressional fluency
                                                                      Vocabulary Completion
                                                                      Similes

Shifts            Flexibility of closure   Adaptive flexibility     Spontaneous flexibility
                    Hidden Pictures          Match Problems           Brick Uses
                    Gottschaldt A            Planning Air Maneuvers   Unusual Uses

Novel responses                                                     Originality
                                                                      Plot Titles (cleverness)
                                                                      Symbol Production

Details           Elaboration*                                      Elaboration*
                    Planning Elaboration                              Planning Elaboration
                    Figure Production                                 Figure Production


* At present regarded as the same factor; future results may indicate
separate factors.

Defining the factor of symbol
manipulation are the two tests Symbol
Manipulation and Sign Changes II.
Symbol Manipulation provides some
simply defined symbols, such as: E
means equal to; NG means not
greater than. Each item then
provides a statement such as: xEy and
yNGz; which of the following
statements can logically be made: xSz,
xNGz, etc. This test was designed
originally for the factor of logical
evaluation (see Table 4), and has
usually shown some relationship to
that factor, but it also helps to define
the factor of symbol manipulation.

The test Sign Changes II presents
simple “equations” such as 1+2
=4×1, the two sides of which are
not actually equal as the statement
stands. The examinee is to say what
interchange of algebraic signs will
make the equation correct. In the
illustration just given, if × and −
are interchanged the equation will
balance (1+2=4−1).

From these two tests alone, it is
not easy to see exactly what kind of
ability is involved in common. One
clue may be that both tests involve
equations. A third test with a
significant loading in one analysis is a
number-series test. This test does
not involve equations. In one
analysis the numerical-facility factor
was distinct from symbol manipulation;
consequently we cannot identify
the latter with the former. Further
intensive work is obviously needed
in the area of these factors. Abilities
that may be of some significance for
success in mathematics may be
found here.
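
The Sign Changes II item described above can be checked mechanically.
The sketch below is an illustration only, not the test's scoring
procedure: it assumes that an item is written with digits and the four
signs +, -, *, / (with * standing for the multiplication sign), and it
simply tries every interchange of two signs to see which ones balance
the two sides.

```python
from itertools import combinations

SIGNS = "+-*/"
ALLOWED = set("0123456789" + SIGNS)


def value(side):
    """Evaluate one side of an equation made only of digits and signs."""
    if not set(side) <= ALLOWED:
        raise ValueError(f"unexpected character in {side!r}")
    # Input is restricted to digits and arithmetic signs by the check above.
    return eval(side, {"__builtins__": {}})


def balancing_swaps(left, right):
    """Return the sign pairs whose interchange makes left equal right."""
    found = []
    for a, b in combinations(SIGNS, 2):
        table = str.maketrans({a: b, b: a})  # swap a and b everywhere
        try:
            if value(left.translate(table)) == value(right.translate(table)):
                found.append((a, b))
        except ZeroDivisionError:
            continue
    return found


# The item from the text: 1+2 = 4x1
print(balancing_swaps("1+2", "4*1"))  # [('-', '*')]
```

For the item in the text, only the interchange of the minus and
multiplication signs balances the equation, which agrees with the keyed
solution.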


Production Factors—Divergent Think- 
ing 

The divergent-thinking factors are 
arranged in a matrix in Table 3 
with the three column categories that 
have now become familiar. Here 
there are more vacancies to be filled, 
if the system is indeed as applicable 
as it promises to be. 

In the first three rows of the table
we find the four well-established
fluency factors. In the first row are the
two fluency factors having to do with
the production of single words. In
the case of the factor of word fluency,
meaning is of no importance. The
usual tests of this factor merely
specify that the words shall begin or end
with a specified letter, prefix, or
suffix. Only such structural
requirements are to be met. The examinee
need not even know the meanings of
the words he gives. In the case of
associational fluency, however,
meaning is an essential requirement. The
words given must be synonyms, as
in Controlled Associations II, or
must be related in some meaningful
way to stimulus words or ideas. In
Controlled Associations II, the
examinee gives as many as three
synonyms to each stimulus word. In
Associations III, two words are
given, differing in meaning, and the
examinee must give one word that is a
synonym to both. For example, the
word “lie” would be given as a
synonym to both “recline” and “deceive.”


TABLE 4


EVALUATION FACTORS


Type of Content
Figural                       Structural    Conceptual

(Perceptual evaluation)*                    Logical evaluation
  Ratio Estimation                            Logical Reasoning
  Figure Estimation                           Inferences

Length estimation                           Experiential evaluation
  Pattern Assembly                            Unusual Details
  Shorter Path
                                            Judgment
                                              Practical Judgment
                                              Practical Estimation

Speed of judgment (across all three columns)
  Color-Form Sort Time
  Social Judgments Time


* Probably a composite of factors, including length estimation.

It does not seem very likely that
an ability will be found for the first
cell in Row 1 of the table. This would
call for the production of words
satisfying specified figural requirements.
Yet, tasks can be thought of to meet
this case, for example, the writing of
headlines, the production of esthetic
effects with words, and so on. It does
not seem likely, however, that there
should have developed in human
makeup a unitary ability of this kind.

The second row of the table offers
some interesting possibilities. The
speed of calling up ideas expressible
in verbal form can be tested by
different kinds of tasks. The two
examples of tests given were designed
for the study of creativity. The Plot
Titles test of fluency is scored by the
total number of low-quality titles
that can be suggested for a short
story plot in a given time. The
Consequences test is scored similarly,
but the responses are consequences
foreseen as a result of some drastic
change, such as everyone going blind.

It can well be questioned whether 
fluency of verbal responses of such 
kinds is strongly related to fluency 
of ideas of a mechanical, or musical, 
or pictorial kind. Fluency tests have 
been commonly cast in verbal form. 
Fluency in the production of figures 
and fluency in the production of 
things distinguished by their
structural properties may well be separate
factors, both distinct from the
ideational-fluency factor now known. The
exploration of such possibilities would 
seem to be a fruitful route to take in 
the investigation of creativity. 

The separateness of the factor
expressional fluency from ideational
fluency indicates that the ability to
have ideas and the ability to put
them into words are different things.
Since the examinee must state
verbally his ideas in tests of ideational
fluency, it might be supposed that
his ability to express himself is
included or is also being tested. But
apparently in such a test the
expressional problem is not a serious one.
We present other tests in which the
idea is given and the examinee must
put it into words, usually in more
than one way. The expressional
problem is then more difficult, the
test giving us variance in the
expressional factor. In the Vocabulary
Completion test, a stimulus word is
used in a brief context, enough to
indicate its meaning, and the
examinee has to give the word. In the
Similes test, the examinee must give
more than one completion to a simile.
In a Verbal Analogies Completion
test, which was designed to measure
another factor, we found that the
leading variance is in the
expressional-fluency factor.

The only complete triad in Table
3 is a set of flexibility factors, the
best-known of which is adaptive
flexibility. The three factors involved
are not clearly parallel in all respects.
They have in common the feature
that sudden shifts of activity occur:
shift of organization of a figure, shift
of set or approach to a problem, or
shift of category of responses,
respectively. Thurstone discovered the
flexibility-of-closure factor in his
analysis of perception (28) and found
that the factor had relations
indicating its intellectual importance.

The most consistently
representative test of the factor of adaptive
flexibility is the Match Problems test.
This test is based upon the old,
familiar puzzle or game of removing a
specified number of match sticks in
order to leave a specified number of
squares. In order to measure
flexibility, the problem changes
drastically from one item to the next,
requiring very unusual solutions,
solutions such as the average person
would not expect. For example, at
first the examinee is led to expect
that the remaining squares will be
of the same size, but there comes an
item in which they must be of
unequal size. Another item requires
that a smaller square be left within a
larger one, and so on.

In an unpublished study, a test
involving Gottschaldt figures came
out as strongly loaded on adaptive
flexibility as did Match Problems.
In the same analysis, a test of
Insight Puzzles also had a similar
loading. Thus, in this case, a perceptual,
a structural, and a conceptual test
had strong loadings on the same
factor. There is therefore the possibility
that flexibility of closure and adaptive
flexibility are one and the same factor
and that this factor cuts across all
three columns of the matrix. In an
analysis where perceptual, structural,
and conceptual flexibility tests are
all liberally represented, however, it
can be predicted that three factors
will be found. If so, they are
probably substantially intercorrelated.
If there are three such factors, the
factor of spontaneous flexibility would
have to be moved to another row of
the matrix to be replaced by a
conceptual-adaptive-flexibility factor.

The factor of spontaneous
flexibility has appeared persistently but
never with great strength or
stability. The Brick Uses test,
flexibility score, is the best clue to its
nature. This score is the number of
runs of responses. The examinee is
told to name all the uses he can think
of for a common brick, in eight
minutes. A “run” of responses is a
sequence of uses all of the same class,
such as the use of bricks as building
material or as missiles, and so on. The
test Unusual Uses calls for listing
several unconventional uses for each
of a number of objects, the number
given being the score. Since only
verbal tests of this factor have been
analyzed, nothing can be said
regarding the possibility that there are
parallel factors involving figural and
structural contents.
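
The run-counting rule for the Brick Uses flexibility score can be made
concrete with a small sketch. It assumes that a scorer has already
assigned each response, in the order given, to a class of use; the class
labels below are hypothetical and serve only as an example.

```python
from itertools import groupby


def flexibility_score(classes):
    """Count runs: maximal stretches of consecutive same-class responses."""
    return sum(1 for _ in groupby(classes))


# Hypothetical scored protocol: the class of each response, in order given.
protocol = ["building", "building", "missile",
            "building", "weight", "weight"]
print(flexibility_score(protocol))  # 4
```

Six responses here fall into four runs. An examinee who stayed within
one class throughout would earn a score of 1 however many uses he
listed, so the score reflects shifts of class rather than sheer output.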


It is of some interest to attempt to
relate spontaneous flexibility to other
concepts in psychology. Essentially,
it appears to be a disposition to avoid
repeating one’s self. This suggests a
relation to Thorndike’s concept of
refractory phase or to Hull’s concept
of reactive inhibition. A hypothesis
to be tested would be that tests
designed to measure individual
differences in tendency to show refractory
phase of the Thorndikian type and
tests to show degree of tendency to
reactive inhibition indicate the same
factor as do tests of spontaneous
flexibility.


The results continue to show that 
originality is operationally definable 
as the likelihood of giving
unconventional, clever, or remotely
associated responses to test items (30).
It is measurable in terms of number 
of clever titles given to story plots, 
clever ‘punch lines’ for cartoons, 
remote consequences to events, and 
idiosyncratic word associations. In 
two analyses there has been oppor- 
tunity for a cleverness factor to sep- 
arate off from the rest, but this did 
not occur. While the factor thus 
seems to be a rather broad one, it 
may well be asked whether such a 
factor, measured only by means of 
verbal tests, is significantly related 
to original production in nonverbal 
activities such as graphic arts, music, 
or inventive engineering. 

We have had only one originality
test that is at least partly nonverbal:
the Symbol Production test. This
test was designed for another
purpose, namely to test the hypothesis
that there is a separate ability to
symbolize ideas in terms of simple
line drawings. Each item presents a
statement, such as “ring the bell,”
of which the two italicized words are
to be represented by two symbols.
The score is the number of nouns and
verbs symbolized in the testing time
given. The test is not entirely
nonverbal, of course, although the thing
produced is figural. There was a
second test (Line Drawing) requiring
the production of line symbols for
adjectives in the same battery with
the Symbol Production test. These
two tests might have given rise to a
separate factor, but they did not.
Nevertheless, the writer is of the
opinion that the problem of whether
there are originality factors peculiar
to nonverbal areas is still an open
one.

The elaboration factor is an ability
to provide details working toward
completion, when a part or an outline
is given. The test Planning
Elaboration presents the bare outline of a
plan to which details must be added
to make it effective. In the Figure
Production test, a simple line is given,
to which the examinee is asked to add
lines to complete an object. The
score depends upon the amount of
detail added.

Here we have a clearly verbal test
and a clearly figural test (although a
meaningful object is usually
produced) both with relation to the same
factor. There is still the possibility
that there are two (or three)
elaboration factors, distinguished in terms
of content, with enough relationship
between them to cause the factors to
appear to be one. It will take a new
analysis, in which at least three good
figure-elaboration tests and three
good verbal-elaboration tests (not to
forget a triad of structural-elaboration
tests, also) are included, to
determine how many elaboration
factors there are.

Considering the factors in the
divergent-thinking category together,
it is obvious that the freedom to
change direction of thinking varies
considerably from one instance to
another. Different degrees of
situation-imposed restriction are involved.
But generally, within whatever
limits are imposed by external
restrictions, the need for rejecting or
superseding a response and for
trying or producing a new one is the
common element in this group of
factors. There is also a difference in the
amount of self-imposed restriction or
freedom. This depends upon the
individual rather than upon the
situation. It is largely in this source of
variation that we find the
divergent-thinking factors.


Evaluation Factors 


Evaluation factors have to do with
decisions concerning the goodness,
suitability, or effectiveness of the
results of thinking. After a discovery
is made, after a product is achieved,
is it correct, is it the best that we can
do, will it work? This calls for a
judgmental step of some kind. It
was our hypothesis in the project
that the ability to make such
decisions will depend upon the area
within which the thinking takes
place and the criteria on which the
decision is based. The results indi- 
cate several evaluation factors. They 
have been placed in the customary 
three-column matrix in Table 4, in 
spite of the fact that none have been 
found to fit the structural column. 
In this group of factors there is no 
good way of distinguishing rows. 
The domain of evaluation factors 
has been less well explored than the 
other intellectual domains. 

The least that can be said is that 
the perceptual-conceptual dichotomy 
applies in this area of abilities. Al- 
though our analysis showed only one 
factor applying to judgments of 
figural material, it is likely that in 
this subarea of evaluation alone there 
are a number of judgment factors. 
For this reason the factor of per- 
ceptual evaluation has been placed in 
parentheses in Table 4. For ex- 
ample, a more restricted factor of 
length estimation has been found (21). 
The search for such factors carries us 
over into the whole realm of psycho- 
physical judgment. It would be dif- 
ficult to say whether factors of this 
kind belong under the general head- 
ing of thinking or under the heading 
of perception. In view of the known 
complexity of psychophysical judg- 
ments in general, their place in the 
intellectual group can be defended. 

The best established evaluation 
factor is that of logical evaluation. 
This is defined as the ability to 
judge the soundness of conclusions 
where logical consistency is the cri- 
terion. The factor has sometimes 
been called “deduction,” with the
belief that it is the ability to draw
conclusions logically consistent with
premises. If this were the case, the 
factor would belong with the produc- 
tion-factors group. Most tests in 
which the factor has been found to be 
a component are of the true-false or 
multiple-choice form, in which the 
examinee is given conclusions; he 
need not produce them. It is diffi- 
cult to say whether he actually does 
produce them for himself first and then
find them among the answers pro- 
vided. But whether he does this or 
not, he must necessarily make a 
judgment as to the correctness of the 
answer—his own answer or the ones 
given him. Even in a completion 
test, this step would be necessary. 
It seems preferable, therefore, to call 
the factor logical evaluation and to 
list it among the evaluation factors. 

It was hypothesized that there 
would be a factor in which evalua- 
tion is made on the basis of past ex- 
perience. Such a factor was found, 
and it is represented best by the test 
of Unusual Details. In this test the 
examinee is asked essentially “What
is wrong with this picture?” in which
there are two features that are incon- 
gruous or inconsistent with common 
experience. In defining this factor, 
whether the emphasis should be 
placed upon the supply of past ex- 
perience or upon an ability to utilize 
that experience is not known. 

The factor called judgment is listed 
with some hesitation. It was found 
repeatedly, but rather weakly, in 
AAF research (21). It is best
represented by a test in which a practical
difficulty is described and several
alternative solutions are offered.
Which one is best, everything con- 
sidered? In common terminology, 
the ability might be recognized as 
wisdom or common sense. In the
aptitudes-project research, there is
evidence that this AAF judgment
factor may be the same as the one called
redefinition. If this is the case, it is
not easy to say where to place the 
emphasis in defining the factor. 

The factor speed of judgment was 
found by Thurstone in his analysis 
of perceptual abilities (28). The 
speed with which the examinee com- 
pletes the sorting of objects accord- 
ing to color or form and the speed 
with which he checks traits that ap- 
ply to himself are both measures of 
the factor. It is thus shown as
cutting across the three content
categories. It might well be classed as a
temperament trait rather than an
ability.

Memory Factors 


There is little doubt about the
grouping of the remaining factors
under the heading of memory factors.
Collecting all such factors from
various sources, we find that seven
qualify for this category. A recent
analysis by Kelley (27) has done much
to verify and complete the picture
for this group. It is possible to
organize these factors in the three
columns of the now familiar
categories as to content, and in three rows
as to the kind of thing or aspect
involved (see Table 5). The titles of
the tests representing each factor are
usually quite descriptive.

The best-known of the memory
factors is rote memory, the ability to
learn and to remember things
associated, where meaning is of little or
no importance. In the AAF research
this factor was called “associative
memory” for the reason that
paired-associate learning was typical of the
tests of it. There was a need, also,
of distinguishing it from the factor
of visual memory, where sheer
content is important rather than
associative connections between contents.
Since Kelley (27) has demonstrated
another associative-memory factor
in the form of meaningful memory,
however, it seems best to return to
the name of rote memory. The
placement of both in an associative row
of the matrix indicates their common
associative property. The vacancy
under the figural heading in this row
calls for the hypothesis that there is
an undiscovered factor pertaining to
the learning of associative
connections between figural contents.


TABLE 5


A MATRIX OF THE MEMORY FACTORS


Thing or aspect           Type of Content
remembered                Figural                   Structural            Conceptual

Associative connections                             Rote memory           Meaningful memory
                                                      Word-Number           Sentence Completion
                                                      Color-Word            Related Words

Substance                 Visual memory                                   Memory for ideas
                            Reproduction of Designs                         Memory for Ideas
                            Map Memory                                      Limericks

                          Auditory memory
                            Musical memory
                            Rhythm

Span                                                Memory span           Integration I
                                                      Letter Span           Signal Interpretation
                                                      Digit Span            Combat Planes


The factor of visual memory has
been known for some time (21). The
factor may be regarded as a rather 
photographic-memory ability. Some 
individuals are recognized as stand- 
ing out in this respect, for example 
certain police officers who remember 
faces and motor-vehicle license num- 
bers remarkably well. In tests, the 
evidence of remembering of this type 
may be in the form of reproductions 
(Reproduction of Designs test), or 
recognition (an AAF Map Memory 
test), or verbal descriptions (an- 
other AAF Map Memory test). 

The listing of a factor with the 
name of auditory memory represents 
in part the writer’s somewhat risky 
hypothesis. It is based upon a factor
found by Karlin (26) in tests of
musical memory (for melody and rhythm).
French (4) called it “musical memory,”
which is the cautious thing to
do. The name “auditory memory”
used here implies some confidence in 
the prediction that when nonmusi- 
cal auditory-memory tests are in- 
cluded with musical-memory tests 
in the same analysis, the same factor 
will apply to both. 

AAF research results hinted at the
existence of a content-memory or
substance-memory factor but did
not demonstrate it. Kelley’s results
give evidence for such a factor. It is
the memory for ideas, which are
probably not expressed verbatim in
recall tests. Further support for this
factor is desirable. The hypothesis
that there is a “content” factor in
the structural column is still to be
investigated. It is not easy to say
what this would be like. The
memory for a route might qualify.

Memory-span tests, composed of
digits and letters, have in common a
memory-span factor. This factor
belongs in the structural column.
Incidentally, it is interesting that
memory-span tests have been rather
popular components of general-intelligence
scales. It turns out that they
measure primarily a rather special kind
of memory ability whose social
importance cannot be very great.
Telephone operators come to mind first
in this connection. A general
remark may be made, prompted by
the emphasis upon memory-span
tests as measures of intelligence, that
although many tests correlate highly
with chronological age, this does not
ensure that they measure any very
significant aspect of intelligence.

In the conceptual column,
Integration I, a factor found in AAF
research, is proposed as a memory-span
factor. The tests Signal
Interpretation and Combat Planes
require the examinee to keep in mind
a relatively large number of detailed
rules for success in them. Kelley (27)
had one span test in which the
content was in the nature of lists of
tasks to be done, the length varying
as in digit and letter-span tests. It
came out with those other span tests
on his memory-span factor. It can be
predicted that if there were other
idea-span tests, and perhaps some
Integration-I tests in the battery,
two span factors would be found.⁶
The span factors are probably
significantly correlated. The vacant
cell in Row 3 of Table 5 suggests that
the way is open for someone to see
whether a third memory-span factor
will be found where the contents are
figural.

To digress somewhat from an
account of the factors, it may be
pointed out that the fact that there
are several distinct memory abilities
may explain some of the phenomena
observed in memory experiments,
particularly where results are
discordant. Results from memory
experiments may differ markedly,
sometimes, depending upon the kind of
material and the thing or aspect
emphasized. For example, the relative
strength of backward vs. forward
associations differs when the material
is composed of visual forms or is
composed of syllables. In transfer
experiments, in view of the different
abilities involved, it should not be
surprising that transfers of gains
in memorizing skills should be so
limited. It would be interesting to
test the hypothesis that transfer will
be relatively greater between tasks
that depend upon the same memory
factor or upon the more strongly
correlated factors. The same
hypothesis could be stated with respect to
thinking factors and other ability
factors generally.

⁶ Another hypothesis is tenable with regard
to Integration I, however. It might be
identical with the factor memory for ideas.


DISCUSSION 


The account of the known
intellectual factors and the system into
which they seem to fall calls for the
discussion of some general questions.
There are implications for factor
theory and for its application to
psychological research in general.
There are implications for general
psychological theory and for the
practices of intelligence testing.


Implications for Factor Theory and Factor Analysis


A theory or a method should be 
judged by its fruits. If the results 
that have been reported here con- 
tribute to psychological understand- 
ing and, through that, to useful psy- 
chological practice, factor analysis 
has passed this kind of test. The 
mathematical model that has been 
applied, which conceives of individ- 
ual differences in intellectual per- 
formances as being represented by a 
coordinate system of n dimensions,
has served certain purposes. While
it may be shown at some future time
that the model is not the best that
could be applied, its power to gen- 
erate new psychological ideas and to 
extend considerably the conception 
of the realm of intellect has been dem- 
onstrated. 

The average reader will no doubt 
be surprised by the large number of 
dimensions that seem to be required 
to encompass the range of intellectual 
aspects of human nature. Some 40
factors are reported as being known
and a great many additional unknown
factors are forecast. This
would seem to go against the
scientific urge for parsimony.

The principle of parsimony has led
us in the past to the extreme of one
intellectual dimension, which
everyone should now regard as going too
far in that direction. There is
actually no fixed criterion for the
satisfaction of the principle of parsimony.
In science we can satisfy the
principle to some degree whenever the
number of concepts is smaller than the
number of phenomena observed.
Forty, sixty, or even a hundred
factors would certainly be a smaller
number of concepts than the number
of possible tests or the number of
observable types of activities of an
intellectual character. In this sense the
principle of parsimony has been
satisfied.


The number of the factors is less unattractive when we find that they
can be subsumed within a system that is describable by a smaller
number of categories or principles, as we have seen in the matrices
of Tables 1-5. Some readers will ask whether, since there are many
probable intercorrelations among the factors, a small set of
second-order factors will not suffice. Granting that we can make
sufficiently accurate estimates of the intercorrelations among the
factors, which the writer doubts that we can do at present, to use
only second-order-factor concepts would lose information. This
follows from the fact that where n linearly independent dimensions
are necessary to describe a domain geometrically, no one dimension
can be entirely accounted for by combinations of the others.
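
The loss of information can be illustrated with a small sketch. It
assumes three independent factor scores per person and one
hypothetical second-order score formed as a weighted sum; the weights
and the two score profiles are invented for illustration and are not
taken from any analysis reported here.

```python
# Sketch: collapsing several independent factor scores into one
# second-order score discards information. All numbers are hypothetical.

WEIGHTS = (0.6, 0.6, 0.6)  # assumed loadings of three factors on one second-order factor


def second_order_score(profile):
    """Weighted sum of first-order factor scores."""
    return sum(w * x for w, x in zip(WEIGHTS, profile))


# Two people with clearly different first-order profiles (standard scores).
person_a = (1.0, 0.0, -1.0)
person_b = (0.0, 0.0, 0.0)

score_a = second_order_score(person_a)
score_b = second_order_score(person_b)

# The profiles differ, yet the second-order score cannot tell them apart.
print(person_a != person_b, score_a == score_b)
```

Any single composite of this kind maps many distinct profiles onto the
same value, which is the sense in which second-order concepts alone
would describe individuals less completely.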

It may be asked whether some of the factors listed are not really
specific factors rather than common factors. This is a legitimate
question. It is not an uncommon experience in factor analysis to find
that what was formerly regarded as a single common factor appears
later to split up into two or more factors. The "splitting up"
description is not completely accurate. It applies best to the fact
that a group of tests having a "factor" in common later divide into
two or more groups, each defining its own common factor. In clear
thinking about this phenomenon, we must keep in mind the distinction
between "factor" as a mathematical concept and "factor" as a
psychological concept. The immediate results of a factor analysis are
in terms of mathematical factors. Whether each mathematical factor
represents a single psychological factor or a combination of
psychological factors has to be determined by interpretation and by
further experimental work applied to the designing of new factor
analyses. Eventually we reach the stage where further efforts to
"split" a factor fail. Whether this has brought us to a specific
factor in any particular case can be decided on the basis of a single
criterion. Are the tests defining this factor essentially just
different forms of the same test? This cannot always be decided with
certainty, but there is usually little difficulty in doing so. If we
suspect that any factor is a specific, a new analysis that includes
more obviously different tests, but tests that should measure the
same common factor, should be done.

Skepticism was expressed above concerning the operation of estimating
factor intercorrelations. This is a somewhat complicated problem for
which there is as yet no good solution. The common procedure in vogue
at the present time for estimating factor intercorrelations is to do
an oblique rotation of axes, locate the primary axes, and determine
the cosines of their angles of separation. The writer has preferred
orthogonal rotations for several reasons. Briefly, any particular
oblique solution to a factor problem is a function of several
nonpsychological circumstances. For one thing, it depends upon the
kind of population tested. This is not so serious, but we should
probably have a different set of factor intercorrelations for each
age group, educational level, cultural milieu, etc., and for
combinations of these. This lack of invariance precludes making any
very general statements regarding the psychological interdependencies
of factors.

A more serious matter is that oblique solutions depend upon the
population of tests that we factor analyze. This is not merely a
sampling problem, for the collection of tests in a battery is never a
randomly selected one, and should certainly not be. Much of this
difficulty hinges on inadequacies of test construction and test
administration. Rarely do we succeed well enough, either by test
construction or by test administration, in exerting the experimental
controls it would take to come out with a score that is a pure
measure of a factor. If two factors happen to be commonly loaded in
the tests that define both of them, it would give the appearance of a
factor intercorrelation whether there was genuine correlation or not.
This kind of result is not uncommon. Until we succeed in exerting
better experimental controls in testing, we shall not have a very
good basis for estimating factor intercorrelations, even for a
specified population of examinees.
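
The point about commonly loaded tests can be sketched numerically. The
example below assumes, as a simplification, that each oblique axis is
passed directly through the loading vector of the tests that define
it, so that the cosine of the angle between the two vectors is read as
the factor intercorrelation. Actual oblique rotations locate primary
axes by other criteria, and all loadings here are hypothetical.

```python
import math

# Sketch: the cosine of the angle between two axes, read as a factor
# intercorrelation. Loadings are hypothetical and refer to two factors
# that are uncorrelated by construction.


def cosine(u, v):
    """Cosine of the angle between two loading vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


# Factorially pure tests: each loads on one factor only.
pure_a, pure_b = (0.7, 0.0), (0.0, 0.7)

# Impure tests: each also carries a loading on the other factor.
impure_a, impure_b = (0.7, 0.4), (0.4, 0.7)

r_pure = cosine(pure_a, pure_b)
r_impure = cosine(impure_a, impure_b)
print(round(r_pure, 3), round(r_impure, 3))
```

With pure tests the axes stay at right angles; with impure tests the
same uncorrelated factors appear substantially correlated, which is
the appearance of intercorrelation described above.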


The question always comes up regarding the origins of factors; are
they inherited or are they acquired, to use the common, loose
expression of this question. The reply is that factor analysis alone
cannot answer this question. So far as factor analysis is concerned,
the factors could all be hereditary in origin, or all environmental,
all some weighted combination of both heredity and environment, or
some due to the one and some to the other source. It will take
experimental work of the usual types to answer this question. But one
thing is clear. The question "Is intelligence inherited or is it
acquired" makes less sense than it ever did. Such a question must be
asked regarding each and every factor. Ferguson (4) has expressed the
interesting hypothesis that factors are a consequence of the
principles of transfer of learning. Many of them may be, to a large
extent. The Ferguson hypothesis is akin to a similar one expressed
earlier in this paper.

In connection with origins of factors, there is also the question of
when in child development the factors make their appearances. To the
extent that factors are developed by experience, they would appear at
such ages as the effects of experience have sufficiently
crystallized. To the extent that heredity is chiefly responsible for
the differentiation of factors, their appearances should be
detectable when maturation effects their differentiation. In either
case, the answer is to be determined by experimental testing and
factor analysis at all age levels at which suitable tests can be
administered. Such analyses should be done in populations very
homogeneous with respect to age and other features. It can be
predicted that the structure of the intellectual factors for children
will be found simpler than that for adults. It can also be
hypothesized that the structure for generally superior adults will be
found more complex than for generally inferior adults.


Implications for Psychological Theory 


It was suggested earlier that although psychological factors are
variables among individual differences, they also indicate
psychological functions within individuals. It is therefore in order
to take the factors seriously as starting points for psychological
theory.


There has never been developed a comprehensive theory of thinking. We
have been short of the essential concepts needed in the construction
of such a theory. In view of the great variety of thinking abilities
(and functions) revealed by factor analysis, the time-honored
concepts of reasoning, induction, deduction, and the like appear even
more inadequate than before. It seems to be of little value to
attempt to relate the factors to those categories. The factors,
instead, have generated their own categories, which have been already
presented. They are essentially operational concepts, since, like
factors, they refer back to the kinds of tests from which factor
definitions were inferred.

Although the general picture of the thinking factors is not yet
sufficiently complete or certain to suggest an obvious, general
theory of thinking, the kind of theory that they will eventually
generate can be seen.

It is fairly well agreed that thinking is symbolic behavior. It is
not surprising, then, that certain factors have to do with symbols,
as such, and with their utilization and manipulation. Of all the
kinds of symbols available to humans in almost any culture, words and
numbers are among those of greatest importance. The factors reflect
these facts.

In the operations of thinking, of 
realistic thinking, in particular, the 
factors indicate the important steps 
or processes of discovery, produc- 
tion, and evaluation, often occurring 
roughly in that temporal order. Di- 
vergent thinking may come into the 
picture along with these other phases, 
and auxiliary to them, particularly 
when they proceed with some diffi- 
culty. Some divergent-thinking pro- 
cesses are also likely to occur in non- 
realistic thinking, when one is simply 
free to do so and finds it rewarding. 
Since realistic thinking is usually
convergent, particularly when there 
is one right answer, at times there 
may be conflicting divergent-con- 
vergent tendencies, a phenomenon 
that has not been reported, to the 
knowledge of the writer. 

Quite generally, it seems, the 
thinking processes of a person may 
proceed more or less ably depending 
upon the kind of content with which 
he is involved—perceived figures, 
recognized structures, or conceived 
meanings. The distinction that has 
sometimes been made between con- 
crete thinking and abstract thinking 
has foreshadowed the major distinc- 
tion here; the distinction between 
figural factors and conceptual fac- 
tors. The appearance of the third 
category—structural—came as a sur- 
prise. If it turns out to be important, 
we have several interesting implica- 
tions. 

One practical implication of the structural category is that tests
based upon letter material and the like may be of limited
significance, if in reality we are interested in predicting behavior
that depends upon factors in the figural or conceptual columns. A
more important implication has to do with the fact that there is a
shortage of known factors in the structural column. A rather direct
reason for this may be that there has been a bias toward figural and
verbal test material, with an unfortunate slighting of structural
material. This would not be so unfortunate if it turns out that in
our civilization not many such factors exist, or if they do exist
they are of relatively little social importance. It may be that there
is actually more structural-type thinking going on than we realize
and that both psychologists and educators have failed properly to
recognize it. In a highly technical age, such thinking would seem to
be important. We might well ask ourselves whether we have overlooked
something of importance in this general area.

The headings of rows in Tables 1-3 present an unusual list of
concepts, which appear to be more epistemological than psychological.
Is this possibly the kind of concepts that we have needed? It may be
possible to give some of them more psychological terminology later,
but at present they refer to the kinds of things that we can know and
can produce. If such terminology describes behavior in a significant
and useful manner, it should be welcomed and its worth should be
recognized. One implication is that the lists seem to be open to new
additions. Consideration of what categories might be added to the
lists might turn up some new fruitful hypotheses regarding unknown
factors and functions.

The subject of problem solving has come into considerable prominence
in recent years. The picture of the thinking factors has important
implications for problem solving. We find that there is no one factor
that can be called problem solving. This is significant. Problem
solving is usually a complicated process. It is clearly indicated
that we should stop looking for any one function or process that is
the sine qua non of all problem solving. As the writer has pointed
out elsewhere, many factors, including perceptual factors as well as
thinking factors, may be called into play, depending upon the nature
of the problem (12).

In the list of thinking factors we find one factor having to do with
the ability to recognize that a problem exists and another factor
that pertains to the diagnosis of the problem. The degree of
generality of either factor is still to be determined. So far as we
know now, either may be restricted to a relatively narrow category of
problems. The next steps in the attack on problem solving should be
to make a survey of the variety of problems that are common and to
attempt to write specifications regarding the factorial abilities
that play significant roles in the solution of each type of problem.
We should then test these hypotheses by experimental and
factor-analytic procedures.

At the beginning of the aptitudes-project investigation of creativity
it was hypothesized that certain special, creative factors would be
found, a few of them being then already known, some not. The results
have supported most of the hypothesized factors but not all (20, 31).
Because these factors were investigated within the arbitrarily
designated domain of creativity, there has been a tendency to think
of them as being the exclusive creative factors. This conception is
not fully correct. Creative thinking, like problem solving (they may
actually overlap in many cases), depends upon different combinations
of factors, and the combination of factors significant to the task
will vary from time to time. The problem confronting us here, as with
problem solving, is to recognize the main categories of creative
production and to seek the significant combinations of factors
involved in them. Although certain factors such as ideational fluency
and originality will carry relatively more weight, other factors not
obviously creative may often be significant, as when an invention
depends upon thinking by analogy or upon visualization.

Thinking has many connections with learning, and hence the thinking
factors are of some importance in learning investigations and
learning theory. Thinking is sometimes regarded as a form of
learning, for while we think we usually learn. Another view of the
connection is that thinking contributes to learning. The latter view
is more productive of approaches to investigation of the role of
factors in learning. It is not enough to conclude that thinking
contributes to learning or even to state and to test this as a
general hypothesis. The questions raised here should be "Where and
how does factor X contribute to learning?" just as it was asked in
the preceding paragraphs where and how each factor contributes to
problem solving and creative activity. Since problem solving and
creative activity are properly regarded as instances of learning, we
need only generalize the question to make it apply to all learning.
Fleishman and Hempel (5, 6, 7) have already provided some excellent
demonstrations of the roles of factors at different stages in the
learning process for certain psychomotor tasks. This type of
investigation should be applied more generally. Certainly we should
have outgrown the glib definition that "intelligence is learning
ability."

The distinction between associa- 
tive and content-memory factors re- 
minds us that not enough attention 
is generally paid to the same distinc- 
tion in studies of learning and mem- 
ory. Learning theory has restricted 
itself almost entirely to the forma- 
tion and retention of associative con- 
nections, leaving out of account the 
learning of substance. 

Speaking of learning suggests the 
practical operation of education. At 
some future time factors should have 
much effect upon educational prac- 
tices, in addition to those effects hav- 
ing to do with assessment. If train- 
ing and experience have much to do 
with the development of the factors, 
it is important to know the factors 
and to determine the procedures 
whereby their development can be 
promoted by education. 

There are many possible relationships of the intellectual factors to
pathology. Defects of memory and thinking are common occurrences in
connection with intellectual losses that are associated with organic
and functional pathologies. If we find by observation and by
experimental study that defects tend to be along the lines of the
intellectual factors, we have another source of evidence for the
validity of the factors as functional unities. In practice, the use
of measures of the factors may be helpful in providing more accurate
and meaningful assessment of intellectual losses. Losses described in
terms of the factor concepts may help in understanding the types of
pathology, and in providing better definitions and diagnostic
criteria.


Intelligence and Intelligence Tests 


A treatment of the factors of intellect would be incomplete without
considering their implications for the concept of intelligence and
for the present and future of intelligence testing. Is the concept of
intelligence still useful? What is the nature of current intelligence
tests in terms of factors? What should the future trends in
intelligence testing be?

As to general terminology, the term "intellect" can be meaningfully
defined as the system of thinking and memory factors, functions, or
processes. The term "intelligence" has never been uniquely or
satisfactorily defined. Factor analysis has fairly well demonstrated
that it is not a unique, unitary phenomenon. A "general factor,"
found by whatever method, is not invariant from one analysis to
another and hence fails to qualify as a unity, independent of
research circumstances, as Vernon has well stated (29). The methods
of multiple-factor analysis, which have been chiefly responsible for
discovering the factors listed above, do not find a general
psychological factor at the first-order level and they find no
second-order factor that can properly lay claim to the title of
"intelligence."

The term "intelligence" is useful, none the less. But it should be
used in a semipopular, technological sense. It is convenient to have
such a term, even though it is one of the many rather shifty concepts
we have in applied psychology. It would be very desirable, for
purposes of communication and understanding, to specify a number of
intelligences: intelligence A, intelligence B, and so on. This could
be done in terms of the combinations of certain intellectual factors
and their weightings in the combinations.

We have such combinations now in connection with the intelligence
tests and scales in common use. Let us consider what kind of
combinations we have in two of the most used intelligence scales. A
really good factor analysis of the Stanford Revision of the Binet
scale would be rather difficult, and cannot be done satisfactorily
without adding to the analyzed battery a liberal number of reference
tests. This has never been done. The best analyses that we have were
done by Jones (24, 25), who found ten factors among 30 selected
items. His resulting picture is not clear because among the 30 items
were essentially alternate forms of tests (at different age levels)
and no outside reference tests were used. A fully satisfactory
analysis of the Stanford-Binet items would undoubtedly reveal more
than ten factors present.

It should be noted that when so many factors are present, a composite
score based upon all the items can measure each component only to a
small degree, if they are nearly equally weighted in the composite.
It can also be predicted that the factorial composition of the Binet
IQ will be found to vary somewhat from one age level to another. This
feature may contribute to a small extent to obtained changes in IQ
where substantial age differences are involved.
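
The arithmetic behind this remark can be sketched as follows. The
sketch assumes uncorrelated factors with unit variance and no error
variance, so that each factor's share of composite variance is its
squared weight divided by the sum of squared weights. The weights are
hypothetical and do not describe any actual scale.

```python
import math

# Sketch: contribution of each factor to a composite score, assuming
# uncorrelated unit-variance factors and no error variance.
# The weights are hypothetical.


def factor_shares(weights):
    """Proportion of composite variance due to each factor."""
    total = sum(w * w for w in weights)
    return [w * w / total for w in weights]


def factor_correlations(weights):
    """Correlation of each factor with the composite."""
    total = sum(w * w for w in weights)
    return [w / math.sqrt(total) for w in weights]


equal = [1.0] * 10               # ten factors, equally weighted
dominated = [3.0] + [1.0] * 9    # one factor weighted three times the others

print(factor_shares(equal)[0], round(factor_correlations(equal)[0], 3))
print(factor_shares(dominated)[0], round(factor_shares(dominated)[1], 3))
```

Under these assumptions, ten equally weighted factors each account for
one tenth of the composite variance, while a single heavily weighted
factor can account for half of it and leave the others with very
little.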

As it actually happens, a Stanford-Binet IQ, or any IQ from a test
whose components are predominantly verbal, is a total score heavily
dominated by the verbal-comprehension factor. This leaves the other
factors with little or no effective voice in the composite, even
though they are represented in the scale. In nonverbal intelligence
tests, there is likely to be less domination by any one factor, but
the nature of the composite varies considerably from battery to
battery.

Analyses of the components of the Wechsler-Bellevue scale have also
been generally inadequate. The most adequate analysis has been done
by Davis (3), who utilized a number of reference tests from outside
the Wechsler battery. He found nine common factors, six of which are
probably to be identified with factors in the intellectual list.
Where standard tests of intelligence are widely used, it becomes
increasingly important to attempt to write the specifications for
their total scores as well as their part scores, so that obtained
scores of individuals may be most meaningfully interpreted.

Intelligence tests will probably continue to be used for some time to
come much as they are. In order to use them most wisely and to
extract the greatest amount of information from their scores, the
specification of such scores in terms of known factors is one
important improvement that could be made. The other great step toward
improvement in intelligence testing would be to emphasize more than
at present some of the socially important factors that have to do
with productive thinking. The knowledge of the factors of this kind
and of the kinds of tests that measure them is largely available.
Only by this kind of extension of intelligence testing can we do
adequate justice to adult, human intellect. Other extensions may also
be very useful, for we are a long way from complete coverage of the
intellectual factors in present tests. For differential prediction,
and this includes the operation of vocational guidance, only
single-factor scores will do complete justice in the description of
individuals. As a necessary prelude to the use of factor measures for
such purposes, we need innumerable validation studies in which
factors play an important role, studies such as those by Hills and
others (23, 18).


SUMMARY 


A listing of the factors that can be regarded as intellectual was
made, including those reported in French's summary of factors (8)
appearing in 1951 and those reported since that time. Of
approximately 40 such factors, seven are memory factors and the
remaining ones have to do with thinking.

An attempt was made to formulate a system into which the factors seem
to fall. The thinking factors were categorized under the general
headings of cognition (discovery), production (convergent thinking
and divergent thinking), and evaluation. The factors in each group
can be arranged according to three kinds of content of thinking:
figural, structural, and conceptual. In the cognition and produc-

In the general discussion, implications of the factors and their
system were pointed out for factor theory and practice, for general
psychological theory, and for the concept of intelligence and
practices of intelligence testing.


It is by now generally recognized that all forms of psychotherapy
yield successful results with some patients and that these successes
depend to an undetermined extent on factors common to many types of
relationship between patient and therapist. This poses a knotty
problem for proponents of various specific forms of psychotherapy who
are convinced that their successes result from their particular
theory or technique and wish to convince others of this. As a result,
problems of research design in psychotherapy have been receiving more
and more critical attention in recent years, especially with
reference to controls (6, 11, 20, 23, 24, 25, 27, 31, 34, 35, 38,
39).

Certain general aspects of the psychotherapeutic relationship seem
very similar to those responsible for the so-called placebo effect,
which is well known to investigators of the therapeutic efficacy of
medications. The purpose of this paper is to describe the placebo
effect, discuss some of its implications for the evaluation of
psychotherapy, and make some recommendations concerning research
design in psychotherapy based on these considerations.


THE PLACEBO EFFECT 


We have now participated in two separate investigations of the
effectiveness of drugs on the symptomatic distress of psychiatric
outpatients (14, 22). Both studies involved the administration of a
placebo, an inert agent outwardly indistinguishable from the agent
being tested, as well as drugs. The physician never knew whether he
was giving the patient drug or placebo. The patients were told that a
new medicine had become available which, it was thought, might help
them. The physicians rated symptoms on a 4-point scale of distress,
with high reliability. In both studies a significant reduction of
distress accompanied the taking of placebos, as shown in Table 1.

This phenomenon occurs with great regularity, not only with respect
to the kinds of symptoms usually associated with psychologic illness,
but with others as well. For example, in a study of vaccines for the
common cold, there was found a reduction in the number of yearly
colds of 55 per cent among those given vaccine and of 61 per cent
among a control group who received injections of isotonic sodium
chloride solution (4). Hillis (15) found placebos as effective as
other agents in inhibiting the cough reflex. Wolf and Pinsky (37)
studied medical outpatients suffering from peptic ulcer, migraine,
muscle tension, headache, and tight muscles in the extremities. All
were also tense and anxious. Twenty to thirty per cent felt better
while taking placebos. Lasagna et al. (19) gave 1 ml. of saline by
subcutaneous injection to surgical patients suffering from steady,
severe wound pains and found that 30 to 40 per cent reported a
satisfactory relief of pain. In a study by Jellinek (18) 60 per cent
of 199 subjects with chronic headaches received relief from a placebo
on one or more
occasions.


TABLE 1
SYMPTOM DISTRESS BEFORE EXPERIMENTS AND AFTER A TRIAL ON PLACEBOS

Columns: Study; Drug tested; Mean distress scores (Before
experiment, After placebo); Significance of difference.

1st study    Mephenesin
2nd study    Reserpine

[Entries legible in the source: 15.88, 24.69 (mean distress scores);
.01, .02 (significance of difference).]


The placebo effect is not always favorable, but may also result in
undesirable, distressful reactions. As far back as 1933, Diehl (3),
using lactose placebos as a control for a variety of medications
taken by mouth, found that some of his subjects receiving placebos
developed nausea, faintness, and diarrhea. Sometimes this "toxic
response" to placebos may even attain major proportions. Wolf and
Pinsky (37) tell of one patient who had "overwhelming weakness,
palpitation, and nausea within 15 minutes of taking her tablets." In
another, "a diffuse itchy erythematous maculopapular rash developed
after ten days of taking pills. A skin consultant considered the
eruption to be typical dermatitis medicamentosa. After use of the
pills was stopped, the eruption quickly cleared." A third patient
developed epigastric pain followed by watery diarrhea, urticaria, and
angioneurotic edema of the lips within ten minutes of taking her
pills. One of our own patients, who had been tolerating a chronic
syphilophobia fairly well, became acutely agitated shortly after
placebo ingestion, bemoaning what the pills had done to him, and
required hospitalization shortly thereafter.

Wolf and Pinsky (37) found that placebos produced more improvement in
subjective than objective manifestations of anxiety and tension, but
objective changes also occur. In our second study (22), 69 per cent
of our patients showed decreased blood pressure and pulse readings
following placebo, 19 per cent showed increased blood pressure, and
25 per cent showed a rise in pulse rate. Wolf (36) demonstrated
clearly and convincingly that actual end-organ changes can follow
placebo administration. This demonstration was made in a series of
studies on the now-celebrated Tom, a human subject with a large
gastric fistula, in whom it was possible to observe directly the
gastric mucous membrane, correlating changes in color and turgidity
with simultaneous measurements of gastric secretion and motor
activity.

The placebo effect may actually reverse the normal pharmacologic
action of a drug. For example, Wolf reports that Tom was repeatedly
given Prostigmine, which induced abdominal cramps, diarrhea, as well
as hyperaemia, hypersecretion, and hypermotility of the stomach.
Subsequently, the same response occurred not only to tap water and
lactose capsules, but also to atropine sulfate, which usually has an
inhibiting effect on gastric function. A pregnant patient with
excessive vomiting showed the usual response of nausea and vomiting
to ipecac. These manifestations were accompanied by cessation of
normal gastric contractions. When ipecac was given through a tube
with strong assurance that it would relieve her vomiting, gastric
contractions were resumed at the same interval after ingestion of the
drug that they would normally have ceased, and her nausea and
vomiting were relieved.


The placebo effect, in short, can be quite powerful. It can
significantly modify the patient's physiological functioning, even to
the extent of reversing the normal pharmacological action of drugs;
and, as will be discussed below, it may be enduring. Placebo effects
cannot be dismissed as superficial or transient. They often involve
an increased sense of well-being in the patient and are manifested
primarily by relief from the particular symptomatic distress for
which the patient expects and receives treatment. Thus, the relief of
any particular complaint by a given medication is not sufficient
evidence for the specific effect of the medicine on this complaint
unless it can be shown that the relief is not obtained as a placebo
effect.
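
The logic of this comparison can be put in a few lines. The sketch
below uses the common-cold percentages quoted earlier (55 per cent
among those given vaccine, 61 per cent among controls) and takes a
simple difference in percentage points; the function name is
illustrative only, and sampling error is ignored.

```python
# Sketch: crude comparison of improvement under an active agent with
# improvement under placebo. Sampling error is ignored.


def specific_effect(active_pct, placebo_pct):
    """Improvement beyond the placebo response, in percentage points."""
    return active_pct - placebo_pct


vaccine_reduction = 55   # per cent fewer colds among those given vaccine
control_reduction = 61   # per cent fewer colds among those given saline

print(specific_effect(vaccine_reduction, control_reduction))
```

Judged by the treated group alone, the vaccine would appear effective;
set against the control group, nothing remains to be credited to the
vaccine itself.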


IMPLICATIONS OF THE PLACEBO 
EFFECT FOR RESEARCH IN 
PSYCHOTHERAPY 

The giving of any medication may have certain meanings for a patient
in terms of his relationship to his physician which may benefit his
condition irrespective of the pharmacological action of the drug. For
example, it may relieve the anxiety resulting from the distress
caused by his illness (10). Wolf believes the effects of placebos on
his patients "depended for their force on the conviction of the
patient that this or that effect would result." The degree of the
patient's conviction might be expected to be influenced by his
previous experiences with doctors, his confidence in his physician,
his suggestibility, the suggestibility-enhancing aspects of the
situation in which the therapeutic agent is being administered, and
his faith in or fear of the therapeutic agent itself. These attitudes
are obviously relevant to psychotherapy.

Psychotherapists have theories of personality and psychotherapy and
plan their therapeutic actions in the belief that these are the
active agents which produce the desired results. Any favorable
changes in patients consequent to a course of psychotherapy tend to
be cited as evidence for the validity of the theory of personality
and neurosis which underlies the rationale of the psychotherapy. In
view of the above discussion it may well be that the efficacy of any
particular set of therapeutic operations lies in their analogy to a
placebo in that they enhance the therapist’s and patient’s
conviction that something useful is being done. Patients entering
psychotherapy have various degrees of belief in its efficacy, and
this may be an important factor in the results of therapy, but this
has not been studied, to our knowledge. We know that the
authoritarian attitude of the physician can produce this conviction
in some patients.

At first glance the attitudes found by Fiedler (8, 9) to
characterize experienced psychotherapists, viz. feelings of empathy
for and closeness to the patient, an undemanding attitude, security,
and the ability to “understand” the patient, seem diametrically
opposed to the authoritarian attitude. It may be, however, that the
therapeutic efficacy of these attitudes lies primarily in their
ability to increase the confidence of certain patients in the
ability of the therapist to help them. Lack of such confidence may
be one of the reasons why patients of lower socioeconomic status
fare less well in psychotherapy than patients higher in this scale
(16, 29), a talking therapy seeming to be beyond their comprehension
and contrary to their conception of the doctor-patient relationship.

In this connection, the role of suggestion in psychotherapy has been
emphasized for years, especially in therapies utilizing hypnosis,
but suggestion effects have been thought by many since Freud to be
superficial and transitory. We know of no experimental study which
demonstrates that therapeutic effects based on insights or
perceptual reorganization, which may also be suggested, are less
superficial or less transitory.


It may be pointed out parenthetically that conviction of the
helpfulness of therapy need not be equated with “motivation for
therapy,” which was investigated by Grummon (13) and Dymond (5) and
found to have little relationship to success in psychotherapy.
Patients are often sufficiently distressed to be strongly motivated
to receive help, yet have little faith that a procedure such as
psychotherapy can help them.

The similarity of the forces operating in psychotherapy and the
placebo effect may account for the high consistency of improvement
rates found with various therapies, from that conducted by
physicians without psychiatric training to intensive psychoanalysis
(7). This explanation gains plausibility from the fact that reported
improvement rates for various series of neurotics treated by
different forms of psychotherapy hover around 60 per cent (1). This
is the same as that reported for the placebo effect in illnesses in
which emotional components may play a major role, such as “colds”
(3) and headaches (18).

To show that a specific form of treatment produces more than a
nonspecific placebo effect it must be shown that its effects are
stronger, last longer, or are qualitatively different from those
produced by the administration of placebos, or that it affects
different types of patients. Our knowledge of all these matters is
still fragmentary, but some beginnings have been made.

With respect to the strength and qualitative nature of the effects
of therapy, one line of endeavor has been to study the physiological
changes occurring during psychotherapy. Since physiological measures
usually used to provide evidence of resistance or frustration (26,
33) or similar psychological states during psychotherapy (28) may
also be influenced by the placebo effect, one cannot conclude that
demonstration of such physiological changes implies a greater depth
of therapy or a more profound reorganization of the personality,
unless we are willing to equate the placebo effect with such
reorganization.

With respect to the duration of improvement, if it could be shown
that the placebo effect is of shorter duration than changes specific
to a given psychotherapy, this would provide one kind of evidence
favoring that theory of psychotherapy. As far as we know, no study
of the limits of duration of the placebo effect has been made. Our
experiment with mephenesin vs. placebo covered four two-week
periods. Figure 1 shows the curves for both agents for the eight
weeks.

Figure 1 shows that the greatest decrease in distress following
placebos was felt during the first two-week trial period. After
that, a slight but statistically insignificant rise in distress
occurred; and, at the end of eight weeks, the placebo effect was
about as great as after two weeks. Unfortunately, our data yielded
no information on how much longer it might have endured. If the
effect is analogous to the relief of pain by placebos in patients
with surgical wounds, we should expect it eventually to diminish.
Lasagna et al. (19) found that as placebo therapy of such patients
continued the relief experienced decreased.

FIG. 1. EFFECTS OF MEPHENESIN AND PLACEBO ON SYMPTOMATIC DISTRESS
OVER AN 8-WEEK PERIOD.

Total patients = 17. At the 2-, 4-, 6-, and 8-week intervals, N for
placebo = 11, 6, 10, and 7 respectively, while N for mephenesin = 6,
11, 7, and 10 respectively. For the 2- and 4-week periods, the
dosage of mephenesin was 3 gms. per day; for the 6- and 8-week
periods, 9 gms. per day.

Although the number of patients is too small to justify any
conclusions, it is intriguing that the first dose of mephenesin
seemed to counteract the placebo effect. In the study with reserpine
(22), the only patients who failed to show a placebo effect were
those who had received reserpine previously. It may be that any
discomfort produced by a pharmacologically active agent tends to
counteract the emotional state responsible for a placebo effect in
susceptible patients. Analogously, an activity by the
psychotherapist which disturbs the patient may conceivably
counteract the placebo effect of psychotherapy with certain
patients.

It would also be helpful to know if patients could be differentiated
according to attributes which predisposed them to a positive or
negative placebo effect. If patients who improved with a particular
form of psychotherapy were all known to be positive placebo
reactors, then the improvement could not be attributed to the
specific form of treatment. If, however, they were known not to be
positive placebo reactors, then any demonstrated improvement would
constitute evidence of efficacy specific to the form of
psychotherapy.

There is little known, however, with regard to the attributes of
placebo reactors. Lasagna et al. (19) have made the first attempts
to investigate this problem and report some attitudes and Rorschach
categories which differentiated their reactors (N=11) from their
nonreactors (N=16). However, only 14 per cent of their patients were
consistent reactors, i.e., showed the effect with every placebo
dose, and 31 per cent were consistent nonreactors, while 55 per cent
showed the effect on some occasions but not on others. This
contrasts with the findings of Jellinek (18) whose patients with
headache were, for the most part, either in the always-relieved
group or the never-relieved group, with only a small percentage of
patients showing inconsistency of response. The apparent
contradiction in findings may perhaps result from the difference in
the cause of the pain in the two series or from other factors. In
any case it indicates that the problem is a complex one needing much
more study.

In the light of these considerations, any method of demonstrating
the specificity of response to a given type of psychotherapy would
have to provide an adequate control design. As far as we know, the
study which has paid closest attention to the question of controls
in research in psychotherapy is that of Rogers and his colleagues
(31). They employed two different kinds of control groups. One was a
group of nonclients who were simply given a battery of tests before
and after specified time periods. The other was a group of clients
who were required to wait a specified period of time before
beginning therapy. This group was tested at the beginning and end of
the wait period, at the end of therapy, and after a follow-up
period.

These procedures do not control for the placebo effect since neither
control group was being subjected to any special procedures which
could produce a reasonable expectancy in control subjects that
certain changes should occur. The experimental group, however, could
be expected to anticipate certain effects merely as a consequence of
participating in the client-therapist interviews. Therefore, even
though favorable changes could be demonstrated in their clients, the
question of whether these were placebo effects could not be answered
from such research design unless additional information were
provided.

If we do not control for nonspecific factors like the placebo
effect, we cannot know whether effects predicted from a theory lead
to or result from improvement based on the nonspecific effect.
Butler and Haigh (2), for example, report an increased correlation
of perceived self with ideal self following client-centered therapy.
The implicit inference is that the specific therapeutic method leads
to this increased correlation which, in turn, contributes to
amelioration of disability and distress.

It is conceivable, though, that as a result of a nonspecific placebo
effect the client feels less disabled and distressed which, in turn,
leads him to describe himself as more like his ideal self. Rogers’
(30) findings of greater emotional maturity in successfully treated
cases may be similarly explained, clients feeling less disabled and
distressed due to a nonspecific placebo response and behaving
consequently in ways which are less anxiety-determined and which are
seen as more mature by others.

We would propose that the follow- 
ing conditions are optimal in planning 
research in psychotherapy: 

1. A theory of personality and psychological distress (neurosis,
maladjustment, etc.).

2. Predictions of effects in the pa- 
tient or client consequent to psycho- 
therapy, in accord with the theory. 

3. Demonstration of a relationship 
between the predicted effects and 
some criterion of improvement. 

4. Demonstration that the predicted effects and their relationship
to the improvement criterion are not due primarily to the patient’s
conviction that therapy will help him. This will permit greater
confidence that the relationship found is specific to the
therapeutic technique derived from the theory.

Ideally, these conditions should obtain both for process and outcome
research. There seems to be general agreement with regard to the
first two conditions although Mackinnon (21) has some reservations
about beginning with a theory rather than a hunch. Gordon et al.
(12) have come to question the third condition, at least with
respect to a “global” criterion of improvement.

The fourth condition has not been met in any research of which we
are aware. It is not possible to set up an experiment precisely
analogous to comparison of a medication with a placebo because there
is no such thing as inert psychotherapy in the sense that placebos
are pharmacologically inert. However, it may be possible to study
the possible specific effects of any particular form of therapy by
the use of a matched control group participating in an activity
regarded as therapeutically inert from the standpoint of the theory
of the therapy being studied. That is, it would not be expected to
produce the effects predicted by the theory. The “placebo
psychotherapy” in this sense would be analogous to placebos in that
it would be administered under circumstances and by persons such
that the patients would expect to be helped by it.

Let us say that our theory is psy- 
choanalytic and our predicted effect 
is an increased correlation between 
the moral values of the patient and 
the therapist (superego identification) and that we also expect an
association between the increased correlation and a criterion of
improvement (32). According to the theory,
there is no reason to believe that con- 
trol patients receiving, for example, 
relaxation therapy (17) will show the 
increased correlation of moral values 
with their therapist’s moral values, 
nor should they show as much or as 
lasting improvement as the patients 
receiving psychoanalytic therapy of 
equal length. Such a design would 
constitute a fair test of the hypothesis 
based on the theory. In comparative 
studies where one type of psychother- 
apy is tested against another, dif- 
ferences found between them in pre- 
dicted effects or amount, nature, and 
duration of improvement would not 
be explainable as placebo effects, if 
the condition could be met that pa- 
tients had equal faith in the efficacy 
of the therapies and therapists to 
which they are assigned. 


SUMMARY AND CONCLUSIONS 


The literature on the therapeutic efficacy of drugs compared with
placebos is briefly reviewed, and its relevance for research in
psychotherapy considered. It is concluded that improvement under a
special form of psychotherapy cannot be taken as evidence for: (a)
correctness of the theory on which it is based; or (b) efficacy of
the specific technique used, unless improvement can be shown to be
greater than or qualitatively different from that produced by the
patients’ faith in the efficacy of the therapist and his technique,
i.e., “the placebo effect.” This effect may be thought of as a
nonspecific form of psychotherapy and it may be quite powerful in
that it may produce end-organ changes and relief from distress of
considerable duration.

To show that a specific form of psychotherapy based on a theory of
personality and neurosis produces results not attributable to the
nonspecific placebo effect it is not sufficient to compare its
results with
changes in patients receiving no 
treatment. The only adequate con- 
trol would be another form of therapy 
in which patients had equal faith, so 
that the placebo effect operated 
equally in both, but which would not 
be expected by the theory of therapy 
being studied to produce the same ef- 
fects. We need to learn more about 
the nature of the placebo effect, the 
conditions giving rise to it, and the 
attributes of patients most suscepti- 
ble or resistant to it so that we may 
obtain a better understanding of the 
role of nonspecific factors in psychotherapy.

In recent years a number of studies involving human Ss have been
devoted to testing the implications of certain Hullian notions
concerning the relationship between performance in learning
situations and level of total effective drive (D). In these
investigations drive has been defined in terms of scores on a
manifest anxiety scale (41). In view of the growing experimental
literature concerning these hypotheses since their initial statement
by Taylor (40) and Taylor and Spence (43), an attempt to outline the
theory as it is presently conceived by the Iowa group and to
evaluate the evidence concerning it seems to be in order.

Before proceeding with these matters, however, certain
misunderstandings which have arisen concerning the purpose of this
work should be mentioned. First, although groups have been selected
exclusively on the basis of scores on the Manifest Anxiety Scale
(hereafter designated as MAS), the interest of the Iowa group has
not been in investigating anxiety as a phenomenon, but rather in the
role of drive in certain learning situations. The assumption has
been made that anxiety scores are related in some manner to drive
level, but in terms of the major theoretical interests of this
group, any other acceptable specification of drive (e.g., hunger)
could be used in experimental tests of the hypotheses about the
effect of drive level. Further, as Farber (6) has pointed out, no
attempt has ever been made to claim that the only difference between
individuals receiving different scores on the MAS is in drive level
or that all performance differences could be explained by drive.
Undoubtedly there are many characteristics other than drive level on
which anxious and nonanxious Ss differ; the investigation of these
additional properties of anxiety groups and their influence on
performance is certainly both legitimate and important, but it
simply has not been the interest of the proponents of the drive
theory.

A second point that should be clari- 
fied has to do with the MAS. The 
construction of the test was not 
aimed at developing a clinically useful 
test which would diagnose anxiety, 
but rather was designed solely to se- 
lect Ss differing in general drive level. 
Thus the question of the scale’s “validity” (i.e., its agreement
with clinical judgments) is in a sense irrelevant to the
experimental purposes for
which the test was developed. In 
light of this, the test might better 
have been given a more noncommittal 
label, such as a measure of emotion- 
ality, although the fact that the items 
on the scale were selected by clini- 
cians as referring to manifest anxiety 
as it is described psychiatrically does 
not make the title completely inap- 
propriate nor a relationship between 
clinical judgments and MAS scores 
unexpected. Certainly the generality of the experimental findings
with the MAS would be increased if correlations were found with
other definitions, and such attempts will be discussed in a later
section. However, regardless of the results of such studies, it
should be clearly understood that “manifest anxiety” has been
defined operationally only in terms of test scores and will be so
employed, unless otherwise indicated, in the present paper.

DRIVE THEORY

As stated earlier, the purpose of the Iowa group has been to
investigate the effects of varying drive level on performance in
learning situations. Actual experimentation has involved two
independent problems: (a) specification of the conditions under
which drive differences are said to appear, and (b) the theory
concerning the effects of drive level on behavior once drive has
been aroused. The first problem concerns the postulated relationship
between the MAS and drive level, the second between drive (or
anxiety) level and performance in various situations. Since the two
are separate matters, an outline of the theory concerning the
influence of drive will be given first and the hypothesized
relationship between drive and MAS scores considered at a later
point.

According to Hull (15), all habits (H) activated in a given
situation combine multiplicatively with the total effective drive
state (D) operating at the moment to form excitatory potential E
[E = f(H × D)]. Total effective drive, in the Hullian system, is
determined by the summation of all extant need states, primary and
secondary, irrespective of their source and their relevancy to the
type of reinforcement employed. Since response strength is
determined in part by E, the implication of varying drive level in
any situation in which a single habit is evoked is clear: the higher
the drive, the greater the value of E and hence of response
strength. Thus in simple noncompetitional experimental arrangements
involving only a single habit tendency the performance level of
high-drive Ss should be greater than that for low-drive groups.
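
As a rough illustration of the multiplicative rule just described,
the sketch below takes f to be the identity and treats total
effective drive as a plain sum of need states. Both choices, the
function names, and all numerical values are assumptions made for
the example; they are not taken from Hull (15) or from the studies
reviewed here.

```python
def total_effective_drive(need_states):
    """Sum all extant need states, relevant or not (simplifying assumption)."""
    return sum(need_states)


def excitatory_potential(habit_strength, drive):
    """E = f(H x D), with f assumed to be the identity for illustration."""
    return habit_strength * drive


# Hypothetical values: a single habit evaluated under two drive levels.
H = 0.5
low_drive = total_effective_drive([1.0])        # one need state only
high_drive = total_effective_drive([1.0, 1.0])  # an added need, e.g. anxiety

E_low = excitatory_potential(H, low_drive)
E_high = excitatory_potential(H, high_drive)
```

With a single habit, the larger D can only raise E, which is the
basis of the prediction for noncompetitional situations.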

Higher drive levels should not, however, always lead to superior
performance (i.e., greater probability of the appearance of the
correct response). In situations in which a number of competing
response tendencies are evoked, only one of which is correct, the
relative performance of high and low drive groups will depend upon
the number and comparative strengths of the various response
tendencies. Predictions concerning the performance of the groups in
such complex tasks involve the introduction of additional Hullian
concepts: oscillatory inhibition (O) and threshold (L).

The concept of O was introduced by Hull (15) in an attempt to allow
for statement, within his system, of the intra-individual
variability in behavior that occurs, presumably, because of
uncontrolled variations from instant to instant within the organism
and in his environment. The value of O is said to vary from moment
to moment, the distribution of O values for a group of (like)
individuals on any trial forming a normal probability function. O is
further assumed to play an inhibitory role, its value being
subtracted from excitatory potential (E), thus yielding momentary
excitatory potential (Ē). In order for Ē to activate a response, it
must attain a minimum or threshold value (L), a value that is
presumably the same for all similar habit tendencies evoked in a
given situation. Thus R = f(Ē) = f(E - O - L).

In any task in which a stimulus tends to evoke a number of competing
responses the response that will appear on a given occasion will be
the one with the highest suprathreshold momentary excitatory
strength (Ē) at that moment. Other things being equal, of course,
the response with the greatest H and hence E value will have a
greater probability of occurring than any other response.
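
The selection rule for a single occasion can be sketched as follows.
This is only an illustration: f is again assumed to be the identity,
the O values for the trial are supplied by hand instead of being
sampled, and the function name and all numbers are hypothetical.

```python
def winning_response(habits, drive, oscillation, threshold):
    """Return the response with the highest suprathreshold momentary
    excitatory potential, or None if no response exceeds the threshold.

    habits      -- dict: response name -> habit strength H
    oscillation -- dict: response name -> momentary O value on this trial
    """
    best_name, best_margin = None, 0.0
    for name, h in habits.items():
        e = h * drive                               # E = H x D
        margin = e - oscillation[name] - threshold  # E - O - L
        if margin > best_margin:
            best_name, best_margin = name, margin
    return best_name


# Hypothetical single trial with two competing tendencies.
habits = {"correct": 0.6, "incorrect": 0.4}
trial = winning_response(
    habits,
    drive=1.0,
    oscillation={"correct": 0.1, "incorrect": 0.05},
    threshold=0.2,
)
```

On this trial the stronger habit wins, but a large momentary O for
that tendency would let the weaker one appear instead, and a low
enough drive would leave every tendency below threshold.
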

Adding the notion of differing drive level to this conception, we
see that the probability of appearance of the correct response
involves an interaction between drive level and the number and
comparative strengths of the correct and incorrect tendencies. When
the correct response is weaker (i.e., has less H) than one or more
of the competing response tendencies, high-drive groups should be
inferior in performance to low-drive Ss. That is, because of the
multiplicative relationship between habit strength and drive, the
stronger incorrect tendencies gain relatively more E than the
correct tendency in the case of high-drive Ss than in low-drive,
thus leading to a greater probability of occurrence of one of the
stronger incorrect responses in the high-drive group. Further, the
possibility exists that under a high-drive level new competing
responses with very weak habit strengths may be brought over the
threshold value of L with the consequence that the probability of
occurrence of the correct response is lowered relative to that in a
low-drive condition.

At the other extreme, the correct response tendency may be highest
in the hierarchy and relatively strong when compared to the
incorrect. In such a situation, which is comparable to the case in
which but a single habit is aroused, the E value for the correct
response would be relatively greater than the other responses in the
hierarchy for the high-drive group than for the low-drive, leading
to the prediction of the superiority of performance of such
subjects.

It should be obvious, then, that maximum inferiority of high-drive
Ss would be expected when a large number of competing tendencies are
present and the correct tendency is both relatively weak and low in
the hierarchy. As the strength of the correct tendency increases
relative to the incorrect, high-drive groups should become less
inferior and eventually superior in performance to low-drive groups.
The exact point of equality would be difficult to specify. Even when
the correct response is highest (though not strongly dominant) in
the hierarchy, high-drive Ss could still conceivably be inferior in
some instances since a greater number of suprathreshold tendencies
could more than offset the advantage of the relatively higher E
value of the correct response for these individuals.¹
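
The interaction between drive level and the position of the correct
tendency in the hierarchy can be illustrated with a small
simulation. It is a sketch under stated assumptions, not a
reconstruction of any analysis in the literature: f is taken as the
identity, O is drawn independently for each tendency from a normal
distribution centered on zero (so only its variability is modeled; a
constant mean could be absorbed into L), and the habit strengths,
drive levels, threshold, and function name are all hypothetical.

```python
import random


def p_correct(habits, correct, drive, threshold=0.2, o_sd=0.3,
              trials=20000, seed=0):
    """Estimate the probability that the correct response is evoked.

    On each simulated trial every tendency gets E = H * drive, an
    independent normal O is subtracted, and the response with the
    highest value above the threshold L is taken as the one evoked.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        winner, best = None, threshold
        for i, h in enumerate(habits):
            e_bar = h * drive - rng.gauss(0.0, o_sd)
            if e_bar > best:
                winner, best = i, e_bar
        if winner == correct:
            hits += 1
    return hits / trials


LOW, HIGH = 1.0, 2.0  # hypothetical drive levels

# Correct tendency (index 0) weaker than its competitor.
weak_low = p_correct([0.3, 0.5], correct=0, drive=LOW)
weak_high = p_correct([0.3, 0.5], correct=0, drive=HIGH)

# Correct tendency (index 0) dominant in the hierarchy.
strong_low = p_correct([0.5, 0.3], correct=0, drive=LOW)
strong_high = p_correct([0.5, 0.3], correct=0, drive=HIGH)
```

Under these assumptions the high-drive estimate falls below the
low-drive estimate when the correct tendency is the weaker one and
rises above it when the correct tendency is dominant, which is the
direction of the predictions stated above.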

An important consideration that should be noted about making
predictions concerning the effect of drive level upon performance in
actual experimental situations is that a behavioral analysis of the
situation must have been made; only in experimental arrangements in
which the results, independent of drive level, permit statements in
terms of competing S-R tendencies are deductions from the theory
possible. While the majority of investigations designed to test
implications of these derivations concerning drive level have
utilized tasks for which analyses in S-R terms had already been made
and found to be useful, occasionally an experiment appears in which
the investigator attempts to evaluate the total theory by comparing
groups on a task which is poorly understood (and for which little or
no rationale is presented) or which clearly involves the
introduction of variables not included in the theory. The
accumulation of empirical evidence concerning the performance of
different groups in any situation or attempts to incorporate
additional variables within any theoretical framework are certainly
to be encouraged, but statements that the results of such studies
refute or confirm theoretical expectations are unwarranted when
there is no evidence that the boundary conditions imposed by the
theory are met.

¹ In a recent review Child (3) incorrectly interpreted the
theoretical analysis outlined above as involving the sudden
introduction of O and L for the situation in which the correct
response is highest in the hierarchy. These concepts are of course
assumed to be operating in all situations, including the
noncompetitional one in which but a single response tendency is
being evoked. No appeal was made to these constructs in the latter
instance, however, since their inclusion would not affect the
predictions. Mention might also be made of other constructs in the
Hullian system (e.g., J, V, K, etc.): it has been assumed that these
are of equal value for all drive groups and that a consideration of
their values would not result in changing any prediction.

ANXIETY 

The use of the MAS to select groups that are postulated to differ in
drive level in an experimental situation has rested on the
assumption that scores on the scale are in some manner related to
emotional responsiveness, which, in turn, contributes to drive
level. Two alternative hypotheses have been entertained concerning
the conditions under which emotionality is evoked. One is that test
scores reflect differences in a chronic emotional state so that
individuals scoring high on the scale tend to bring a higher level
of emotionality or anxiety “in the door” with them than do Ss
scoring at lower levels (40). A second alternative conception is
that MAS scores reflect different potentialities for anxiety
arousal, high-scoring Ss being those who tend to react more
emotionally and adapt less readily to novel or threatening
situations than do low scorers (28, 37). According to the first
hypothesis differences among anxious and nonanxious groups
(providing other conditions imposed by the theory are met) should be
found whether or not there is any “threat,” in the form of noxious
stimulation, fear of failure or the like, in the situation. Thus,
for example, the performance of anxious Ss should be superior to the
nonanxious in both classical defense conditioning, in which a
noxious stimulus is employed, and in reward conditioning into which
no objective threat has been introduced. In the case of the second
conception, differences would be expected in the performance of
anxiety groups only in those situations in which some threat is
present. Should this be the correct conception, exact specification
of the conditions thought to be sufficient to evoke anxiety would be
necessary in order to test hypotheses concerning the role of drive.
Available evidence suggests that the magnitude of differences among
groups may be related to the level of noxious stimulation employed
(37), or to stress-producing instructions (10, 19), suggesting that
differences in drive level among groups may depend at least in part
upon situational factors. However, the picture is complicated by the
results of a number of studies in which differences among anxiety
groups have been found in the absence of noxious stimulation or
instructions designed to produce stress (8, 24, 25, 26, 42).

Most investigators have not explicitly considered this issue,
assuming either that anxiety scores reflect a chronic level of
emotionality or that factors are present in the typical laboratory
experiment that result in different anxiety levels among groups.
For purposes of evaluating those studies in which degree of stress
has not been under investigation, the assumption will tentatively be
made here that in all situations, individuals scoring high and low
on the anxiety scale will differ in drive level, for whatever
reason. The evidence more directly concerned with the conditions of
anxiety-arousal will be considered at a later point.
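
The contrasting predictions of the two conceptions can be set side
by side in a toy model. The labels "chronic" and "reactive", the
function name, and every numerical value below are hypothetical and
serve only to make the logic explicit; nothing here is estimated
from data.

```python
def drive_difference(hypothesis, threat):
    """Predicted drive difference (anxious minus nonanxious group).

    threat -- 0.0 for a situation with no threat, larger for more threat.
    All numerical values are hypothetical.
    """
    if hypothesis == "chronic":
        # High scorers carry more emotionality into every situation.
        anxious, nonanxious = 1.5, 1.0
    elif hypothesis == "reactive":
        # Groups start equal; high scorers respond more strongly to threat.
        anxious = 1.0 + 0.8 * threat
        nonanxious = 1.0 + 0.2 * threat
    else:
        raise ValueError("unknown hypothesis: " + hypothesis)
    return anxious - nonanxious


no_threat = {h: drive_difference(h, 0.0) for h in ("chronic", "reactive")}
with_threat = {h: drive_difference(h, 1.0) for h in ("chronic", "reactive")}
```

Only the no-threat situation separates the two conceptions in this
sketch: the first predicts a group difference there and the second
does not, while both predict a difference once threat is present.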


EXPERIMENTAL EVIDENCE 


Classical conditioning. Classical
conditioning is said to be a
noncompetitional situation in which but a
single response tendency is being
acquired; theoretical expectation
therefore is that anxious groups will
perform at a higher level than
nonanxious. The results of a number of
studies of eyelid conditioning using
groups with extreme scores on the
MAS² have upheld these predictions,
anxious Ss showing a greater number
of CR’s than nonanxious (11, 35, 37,
38, 39, 40). In all cases but one (11),
these differences were statistically
significant, the exception involving the
use of only 10 Ss per group,
considerably fewer than were employed in
other investigations. Data from eyelid
conditioning studies performed in
the Iowa laboratories and elsewhere
(39) are also available from Ss scoring
throughout the entire range of
anxiety scores rather than only at the
two extremes. The relationship between
anxiety and conditioning scores
has been uniformly found to be
monotonic although not always linear,
middle-anxiety Ss tending to show a
performance level closer to the
low-scoring than the high-scoring groups.
The magnitudes of the correlation
coefficients obtained have been in the
neighborhood of .25, thus indicating
that relatively little of the variance
among Ss can be accounted for in
terms of anxiety scores. In view of
the low correlation and the monotonic
relationship between the two
variables, continued use of extreme
groups only for research purposes in
such situations seems justified.

2 In almost all of the studies involving the
MAS, a comparison has been made of extreme
scorers, typically the 20th percentile or below
(nonanxious) and 80th percentile or above
(anxious) in terms of a standardization group
of college students (41). Use of the terms
“anxious” and “nonanxious” groups here
should be understood to refer to such
extremes unless otherwise indicated.
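
The statement that a correlation near .25 accounts for relatively little
of the variance can be checked with a line of arithmetic: the proportion
of variance shared by two linearly related variables is the square of
their correlation coefficient. The sketch below is only an illustration
of that arithmetic; the function name variance_explained is hypothetical
and does not come from the studies reviewed here.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance shared by two variables with Pearson correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# r = .25 is the approximate value quoted in the text above.
share = variance_explained(0.25)
print(f"{share:.4f}")  # 0.0625, i.e. roughly 6 per cent of the variance
```

On this reading, about 94 per cent of the variance in conditioning
scores is left unaccounted for by anxiety scores, which is the sense in
which the relationship is described as weak.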

A conditioning study employing a 
response other than the eyeblink has 
also been reported in the literature. 
An investigation by Bitterman and 
Holtzman (1) utilized the PGR tech- 
nique which, like the eyelid situation 
it will be noted, involves defense con- 
ditioning. After dividing a group of 
randomly selected college students 
into the upper and lower 50% on the 
basis of MAS scores, these investiga- 
tors found a slight but statistically 
insignificant superiority in condition- 
ing level on the part of their anxious 
Ss. Since their anxious group in- 
cluded individuals with scores con- 
siderably lower than those in the in- 
vestigations referred to above, this 
lack of statistical significance is not 
too surprising.

Several studies are available con- 
cerning differential conditioning, also 
in the eyelid situation (11, 34, 36). 
The predictions derived from the 
theory in this instance are that anx- 
ious Ss should exhibit a greater excita- 
tory strength both to the positive 
(reinforced) CS and to the negative 
(nonreinforced) CS and further, that 
the difference in excitatory strengths 
of the two stimuli should be greater 
for the anxious group. By transform- 
ing all raw data into excitatory 
strength values, Spence and his col- 
leagues (34, 36) have attempted to 
test these predictions in some five 
separate instances. In each case, the
excitatory strength to the positive CS 
during differential conditioning was 
significantly greater for anxious Ss, 
as was expected. The results con- 
cerning the remaining two predictions 
were not so clear-cut. In four out of 
five independent instances the excita- 
tory strength to the negative stimulus 
was greater for the anxious Ss but in
no case was the difference significant.
In all five cases the difference between
excitatory strengths was in the
expected direction but was significant 
in only one instance. While the re- 
sults of these studies tend to lend 
some support to the theory, some- 
what contradictory findings have 
been reported by Hilgard, Jones, and 
Kaplan (11). As mentioned earlier, 
contrary to other studies of simple 
eyelid conditioning, these investigators
found only a slight, statistically
insignificant superiority for anxious
Ss during training to the positive CS.
During differential conditioning, the
anxious group continued to exhibit
an insignificant superiority to the
nonanxious on the positive CS. How- 
ever, the responses of the anxious Ss 
to the negative CS were significantly 
greater as would be expected by drive 
theory. 

Stimulus generalization. Stimulus 
generalization, to which differential 
conditioning is related, has been in- 
vestigated more directly by Rosen- 
baum (28) and Wenar (48). Rosen- 
baum found greater responsiveness to 
generalized stimuli in a spatial situa- 
tion for an anxious group than for a 
nonanxious group, as would be pre- 
dicted by drive theory, but only in 
the case of Ss given strong intermit- 
tent shock during their performance; 
for groups given a weak shock or 
buzzer, no significant differences
emerged. After training groups of
anxious and nonanxious Ss on a
key-pressing response to a strong shock,
weak shock or a buzzer presented at
regular intervals, Wenar (48) meas- 
ured the reaction time to these stim- 
uli in a test series in which the inter- 
vals of presentation were longer or 
shorter (temporal generalization) 
than those employed during training. 
Reaction time was related signifi- 
cantly to both stimulus intensity and 
anxiety level, response time being 
quicker as these variables increased. 

Maze learning. The first study to 
be concerned with demonstrating 
that the relative performance of anx- 
ious and nonanxious Ss is a function 
of degree of interference within a 
task was reported by Taylor and 
Spence (43), who used a type of serial 
verbal maze. On the assumption 
that errors in such a situation are 
largely the result of interfering re- 
sponse tendencies, due to remote as- 
sociations, etc., it was expected that 
anxious Ss would make more errors 
and take more trials to reach a cri- 
terion than nonanxious. The results 
of this study and of a subsequent in- 
vestigation by Farber and Spence 
(8) with a stylus maze have confirmed
these hypotheses, the greater number 
of errors and trials to criterion being 
made by the anxious groups. An ad- 
ditional prediction was also made for 
these maze data, namely that the de- 
gree of inferiority of the anxious Ss 
in comparison to the nonanxious 
should be positively related to
difficulty of the choice point. In both
studies, significant rank-order
correlations were obtained between the
difference in number of errors be- 
tween groups on an individual choice 
point and the difficulty of that point. 
Although these results tend to con- 
firm theoretical expectation, some 
discrepancy between prediction and 
the experimental findings occurred 
on the easiest choice points. In each 
investigation, the small number of
errors on the easiest two or three
points suggests the presence of few
interfering tendencies so that the
anxious might be expected to be
superior in performance. Even here,
however, they tended to be inferior.
In addition to the two studies
utilizing extreme groups, one study of
stylus maze learning involving the
entire range of anxiety scores has
been reported. After splitting a
randomly selected group of college
students into 7 anxiety groups according
to their MAS scores, Matarazzo et al.
(24) found a linear relationship
(r=.25) between anxiety and trials
to the criterion on the maze.

While the investigations reported
above have found differences between
anxiety groups on maze performance,
Hughes, Sprague, and Bendig (14),
utilizing extreme groups, failed to
duplicate these results with several
serial verbal mazes. Different from
the Taylor and Spence study in which
the typical 2-second rate of stimulus
presentation was employed, Hughes 
et al. used a 4-second rate in all cases. 
Previous investigations have demon- 
strated (12) that performance is posi- 
tively related to the interstimulus in- 
terval in serial learning but since the 
effects of this variable are poorly
understood, the implications of the
failure to find differences between
anxiety groups with the 4-second
condition are not clear. One possibility,
based on the assumption that
differences in anxiety level are largely
determined by situational factors, is
that under longer time intervals,
stress upon Ss, and hence upon
differences in emotionality between
anxious and nonanxious, is minimized.

Verbal learning. Rather than
attempting to demonstrate an interaction
between anxiety level and degree
of interference by examining individual
items within a single task, as was
done in the maze studies, Montague
(25) formed three different lists of
serial nonsense syllables which,
because of varying degrees of formal
intralist similarity and association
value of the syllables, presumably
differed in the amount of intralist
interference. A significant interaction was
found between anxiety and list, an
anxious group being significantly
superior in performance to nonanxious
on the list for which similarity was
low and association value high, and
the position being reversed for groups
given a list of high similarity and low
association value. Similar findings
have been reported by Lucas (19) in
a study in which Ss were asked to
recall lists of consonants read to them.
As the number of duplicated consonants
within a list was increased,
anxious Ss showed a significant decrease
in the amount recalled while the
performance of the nonanxious was not
affected.

While a number of investigators
have employed serial learning tasks,
from the point of view of testing the
implications of drive theory, the
paired-associate technique seems to
be preferable. Whereas intralist
interferences due to such factors as
remote associations are inherently part
of serial learning and are thus difficult
to manipulate, the use of discrete
S-R pairs permits more precise control
of the number and strength of the
response tendencies elicited by each
stimulus. Turning to the investigations
that have employed this
paired-associate arrangement, several
studies have attempted to minimize the
presence of competing response
tendencies and thus to demonstrate the
performance superiority of anxious
Ss. In one, Taylor and Chapman
(42) chose nonsense syllables with
low formal similarity, in an attempt
to provide a noncompetitional ar- 
rangement in which each stimulus 
tended to evoke only its own re- 
sponse. As expected, on two lists for 
which such low similarity obtained, 
anxious Ss were significantly superior 
in performance to nonanxious. Simi- 
lar superiority of anxious Ss has been 
reported by Spence (33) on an adjec- 
tive list in which the association be- 
tween each S-R pair was presumed to 
be initially strong and minimum sim- 
ilarity existed among pairs. In a sec- 
ond part of this investigation, an at- 
tempt was made to maximize the 
number of competing tendencies by 
having a high degree of synonymity
among stimuli. As predicted, an anx- 
ious group in this case was inferior. 

The initial strength of association 
between S-R was also manipulated 
by Ramond (26) in an investigation 
involving a variation of the standard 
paired-associate technique. Each
stimulus, an adjective, had connected
with it two response words, one
judged to be highly associated with
the stimulus and the other with no
discernible association. Each type of
response was correct for half of the
items. When the low association
responses were correct, anxious Ss were
expected to perform at a lower level
than nonanxious because of the
greater interference of the strong,
incorrect response for this group. The
results confirmed this prediction.
Theoretical expectations for the
situation in which the stronger response
was correct are not so clear-cut since
the arrangement of the list made it
likely that as learning took place the
low association responses would
interfere occasionally with the high
association response because of stimulus
generalization. Thus, while anxious
Ss might be expected to be superior
early in learning, they might lose this
superiority as the weak responses are
learned and provide competition.
The results lent some support to these
expectations, anxious Ss first being
superior and then inferior to nonanx- 
ious although the over-all difference 
between groups did not reach statisti- 
cal significance. 


ANXIETY SCORES AND THEIR 
RELATIONSHIP TO STRESS 


As was indicated earlier, two alter- 
native hypotheses have been enter- 
tained concerning the difference be- 
tween Ss scoring high and low on the 
MAS with respect to anxiety: that 
such groups have different levels of 
chronic anxiety or that the groups 
instead differ in their emotional 
reactiveness to anxiety-evoking stim- 
uli present in a situation. 

The studies of verbal learning just 
discussed indicate that whether due 
to chronic or situational factors, dif- 
ferences between high and low scor- 
ing Ss cannot be said to be produced 
only when stress is deliberately in- 
troduced into the situation, either by 
means of noxious stimulation as in 
the case of defense conditioning or by 
the administration of stress-provok- 
ing instructions (e.g., reports of fail- 
ure). Consideration of the studies 
into which some threatening stimula- 
tion has been introduced may, how- 
ever, throw some light onto the ques- 
tion as to whether differences in anx- 
iety among groups could depend, at 
least in part, on situational variables. 

Should situational factors play a 
role in determining differences in 
emotionality among anxiety groups, 
the strength of the UCS in classical 
conditioning might be expected to be 
related to such group differences. A 
comparison of three experiments of
eyelid conditioning from the Iowa
laboratory involving a relatively
strong, medium, and mild UCS,
respectively, was made by Spence and
Farber (35). Examination of the
mean conditioning scores reveals that
while intensity of the UCS tended to
be related to performance, the
magnitude of the difference between
anxious and nonanxious remained
relatively constant under the different
intensities. Different results were
obtained by Spence and his associates
(37) in a study specifically
undertaken to evaluate the effect of the
strength of noxious stimulation on
anxiety groups. In this investigation
the Ss, selected without reference to
their anxiety scores, were conditioned
with a relatively weak UCS, but one
group was given occasional electric
shocks between trials, another
threatened with shock, and a third trained
under neutral conditions. These
latter Ss, run under neutral conditions,
gave fewer CR’s than the other
groups, especially in earlier trials.
When Ss were later divided into the
upper and lower 50 per cent according
to anxiety scores, it was found
that while the high-scoring group
conditioned without shock or threat
of shock exhibited only a slight,
statistically insignificant superiority in
conditioning performance, the difference
between anxiety groups was
highly significant for Ss with whom
shock or threat of shock was employed.

The previously mentioned studies 
of stimulus generalization by Rosenbaum
(28) and Wenar (48) were also
concerned with variations in the
intensity of noxious stimulation, in
both cases a buzzer and two intensities
of shock being employed. While
Rosenbaum found a significant difference
between groups only when
strong shock was used, Wenar’s results
(with a somewhat different
experimental arrangement) indicated a
greater responsiveness for the
anxious group under all three conditions.
Furthermore, the magnitude of the 
difference between groups was unaf- 
fected by stimulus intensity. 

Turning to verbal learning, Deese, 
Lazarus, and Keenan (4) have re- 
ported a study in which the effect of 
electric shock on serial learning was 
investigated. Here it was found that 
nonanxious groups given intermittent 
shocks performed at a significantly 
lower level than a nonanxious control 
group run under neutral conditions. 
In contrast, the performance of the 
anxious groups remained relatively 
constant, Ss run under shock not dif- 
fering from their control group. Fur- 
ther, when all conditions were com- 
bined, the performance of the anxious 
was significantly superior to the
nonanxious.³ Thus, while the differences
between groups increased under 
shock, they were due to the disrup- 
tive effect of the shock on the non- 
anxious Ss. 

3 Although, presumably, the serial list was
of relatively low intralist similarity, it is
difficult to tell from the writers’ description
what drive theory would have predicted
concerning the performance of the anxiety
groups, independent of the stress factor. In a
second, parallel experiment involving a more
difficult list (12 consonant syllables
composed of only 5 consonants) presented for
a standard 12 trials, Lazarus, Deese, and
Hamilton (17) found no differences among
groups either as a function of anxiety scores
or of shock-no-shock conditions. While
these results appear superficially to be
contradictory both to drive theory (which would
expect inferiority of anxious Ss) and to the
results of the first study with respect to the
influence of shock, inspection of their data
indicates that all groups averaged only about
one correct response per trial. Since so little
learning took place it is not surprising to have
no differences in performance among groups.
For this reason it is felt that the study does
not provide very meaningful evidence on the
effects of either anxiety level or shock on task
performance.

Quite in contrast to the results of
Deese et al. are the findings obtained
by Gordon and Berlyne (10) in an
investigation of verbal learning utiliz- 
ing psychological stress rather than 
noxious stimulation. After being told 
that the tasks were measures of intel- 
ligence and that their performance 
on a paired-associate list was above 
average, anxious and nonanxious
groups did not differ significantly in
amount of negative transfer on a
second paired-associate list. An anxious
group told that their first list
performance was below average,
however, exhibited significantly more
negative transfer than did a comparable
nonanxious group. Finally, in
the Lucas study (19) mentioned
earlier in which the recall of consonant
lists varying in number of duplications
was investigated, the effects of
varying numbers of reports of failure
to meet expected standards were also
studied. While nonanxious Ss
increased the amount recalled with
greater numbers of failure experiences,
the anxious groups did
significantly worse.

As may be seen, the available
evidence does not present a clear-cut
picture with respect to the effects of
stress. Summarizing first those
investigations involving noxious
stimulation, the results indicate that with
one exception (4) the performance of
all Ss tends to be affected in the same
direction as is found with an increase
in anxiety (MAS) level. The magnitude
of the difference between
anxious and nonanxious Ss either
remains constant with greater degrees
of stimulation or is increased. The
data from the two studies employing 
psychological stress (in both cases 
defined by telling S he had failed to 
achieve adequate standards on an 
intelligence test) have revealed some- 
what different relationships. In both 
instances (10, 19) the performance 
of anxious Ss under stress was
significantly worse than the anxious
group tested under neutral conditions
while the performance of
nonanxious Ss was in one case the same
and in the second better than the
control group. Thus, the magnitude
of the difference between anxiety
groups was greater under stress than
under neutral conditions. 

The available evidence suggests 
then that situational sources of stress 
may play a role in determining the 
difference in anxiety level between Ss 
scoring at the extremes of the MAS. 
Whether the differences between 
groups in the verbal learning studies 
into which no objective stress had 
been introduced by the experimenter 
reflect chronic anxiety level or uni- 
dentified sources of threat remains 
an open question. Speculating on 
this point, to many college soph- 
omores psychology experiments per 
se may be seen as somewhat threaten- 
ing, particularly when the task could 
be interpreted as reflecting on their 
personality or intelligence. It is
perfectly possible that in experimental
arrangements involving no noxious
stimulation or stress-inducing
instructions which call upon skills not
particularly valued by college
students, differences between groups
might disappear.⁴

4 A study of classical reward conditioning of
the salivary response by Bindra, Paterson,
and Strzelecki (On the relation between
anxiety and conditioning, Canad. J. Psychol., 1955,
9, 1-6) which appeared after this review was
written confirms this suggestion. No difference
was found between anxious and nonanxious
groups.

Using the results of these studies
involving stress to attempt to
determine the source of anxiety differences
between high- and low-scoring Ss or,
for that matter, to test drive theory,
involves the assumption that the only
effect of stress in any situation is to
increase drive level or, at least, that
anxious and nonanxious groups do
not respond differentially to stress 
except with respect to anxiety or 
drive. Although no systematic
exploration has been made of the
relationship between degree of noxious
stimulation and performance on vari- 
ous types of tasks, an examination of 
the general literature concerning the 
effect of such stimulation in nonver- 
bal, noncompetition situations lends 
some credibility to this assumption 
(32). It is important to note that 
with one exception (4) the studies of
the effects of noxious stimuli on
anxious and nonanxious Ss have
employed tasks of this type.

In contrast, the literature concerning
studies of psychological stress
(e.g., ego-involving instructions,
reports of failure), most of which have
employed quite complex tasks,
suggests that factors other than or in
addition to drive level are involved.
The variety of roles or effects that
stress may have in addition to the
motivational one has been discussed
by Lazarus, Deese, and Osler (18)
and more recently by Farber (7).
Particularly pertinent to the present
discussion is the finding that there
are wide individual differences in
response to such stress, some
individuals improving in performance, others
decreasing, and still others being
unaffected. The direction of the effect
of stress has further been related to
several personality variables (18).
The Ss scoring at the extremes of the
MAS continuum may react to such
stress with characteristically different
patterns as well. Thus, it is possible
that with increasing degrees of
stress, differences between anxious
and nonanxious other than drive may
be aroused and become responsible,
at least in part, for the discrepancy
between the performance levels of
such groups.
Unfortunately, the two available
studies involving psychological stress
do not permit an evaluation of this
suggestion (nor of the possibility that
stress of any type, physical or
psychological, may have a similar effect in
tasks of sufficient complexity). Both,
it will be recalled, used learning tasks
of such a type that an increase in
drive level might be expected to
result in deterioration of performance.
Thus, it could be argued that the
anxious were “threatened” (had their
drive level increased) by the stress
instructions and hence deteriorated in
performance in comparison to their
neutral control group while the fact
that the nonanxious under stress did
not show a similar inferiority merely
indicates that they were emotionally
unaffected by the stress conditions.
The only hint that more might be
involved than drive level is contained
in the Lucas study in which
nonanxious improved with a greater
number of failure experiences while
the anxious became worse. Such a
finding suggests further that these
additional factors, if any, might act
in the direction of interfering with the
performance of anxious Ss and of
facilitating the performance of
nonanxious. Additional research upon
the effects of stress on anxiety groups,
particularly with tasks of different
levels of complexity, is certainly
needed to provide information about
these possibilities.

The suggestion that at least psy- 
chological stress may have other than 
drive effects on anxious and nonanx- 
ious Ss in complex tasks bears some 
resemblance to the empirical predic- 
tions proposed by Sarason and Man- 
dler and their associates (22, 23, 29) 
for the performance of groups selected 
by a different measuring instrument, 
a questionnaire of “test anxiety,”
designed to select individuals reacting
with different degrees of anxiety
to intelligence tests and course ex- 
aminations. These investigators hy- 
pothesized that such high-anxious 
individuals react to an experimental 
situation represented as a test of in- 
telligence or the like (thus, according 
to their conception, creating stress) 
not only with more anxiety or drive 
than low-anxious but also, as a result 
of past learning, have evoked by 
their anxiety irrelevant response tend- 
encies which interfere with task per- 
formance. Under increasing stress 
(such as reports of failure) the per- 
formance of high-anxious Ss worsens 
because of the arousal of a greater 
number of these irrelevant tenden- 
cies, offsetting the facilitating effects 
of drive; the performance of the
low-anxious, however, improves with
greater stress due to an increasing 
drive level, unaccompanied by irrele- 
vant tendencies. Such a theory, al- 
though predicting the same results as 
would be expected from the notions 
being put forward here about the ef- 
fect of stress on the performance of 
anxious and nonanxious in complex 
tasks, differs from these suggestions 
in several ways. In contrast to drive 
theory, Sarason and Mandler seem 
to imply that other things being 
equal, heightened drive always results 
in raising performance, independent 
of the type of task involved. Further 
they propose that the effect of stress 
is to evoke certain disruptive re- 
sponse patterns in addition to drive 
only for high-anxious Ss while the 
suggestion of the present writer is 
that additional factors may be elic- 
ited under stress for both anxiety ex- 
tremes although their effects on per- 
formance may be in the opposite 
direction. 

Although Sarason and his col- 
leagues have confined their interests 
to “test anxiety” and its effects,
primarily, on intelligence-test items
under stressful conditions, Child (3) 
has proposed that all the work done 
with Ss scoring at the extremes on 
the MAS, independent of whether 
stress is introduced, could be more 
plausibly explained by such an inter- 
ference theory. These task-irrelevant 
responses are always present in anx- 
ious Ss, as well as a higher drive level, 
Child states, but they disrupt per- 
formance only in complex situations 
“where the subject is already in con- 
flict between various response tend- 
encies relevant to the task [so that] 
the presence of irrelevant response 
tendencies heightens the conflict and 
interferes with performance to a 
greater extent than increased drive 
improves it” (3, p. 154).

It would appear to the present
writer that a theory that attempts to
attribute all inferiority of performance
to irrelevant tendencies would
either be forced to predict that
anxious Ss would always be inferior to
nonanxious in such complex tasks as 
verbal learning (since it seems hard 
to maintain that even with verbal 
materials having little intratask in- 
terference, irrelevant extratask re- 
sponses could not interfere with per- 
formance) or, if already obtained re- 
sults are to be explained, that anxiety 
level and its correlated irrelevant re- 
sponse tendencies would shift up and 
down abruptly from task to task and 
even from stimulus to stimulus within 
a task as the number of competing 
response tendencies directly elicited 
by a stimulus varied. Tying the
number of extratask responses to the 
number of intratask interferences 
would seem merely to be adding one 
more variable to those considered by 
drive theory without making differ- 
ent predictions in the situations to 
which drive theory has been thought 
to be applicable. 
It is interesting to note that the
suggestions being proposed here
concerning the possible role of response
as well as drive differences in the
performance of anxious and nonanxious
Ss in stress situations lead to a
different prediction than do Child’s
hypotheses in certain cases. According
to the present writer, on verbal
tasks in which anxious Ss are
demonstrated to be superior to nonanxious
under neutral conditions, the
introduction of stress might be expected
to minimize this difference between
groups or even to reverse its
direction, the performance of anxious
Ss being lower than under neutral
conditions and the nonanxious
possibly being higher. Child, while
perhaps also expecting nonanxious Ss to
be better under stress than under
neutral conditions, would be forced
to predict that an anxious group
under stress would be the same as or
even superior to its neutral control
group rather than worse. That is, the
fact that under neutral conditions the
anxious Ss perform at a higher level
than nonanxious would indicate,
according to Child, that this was a
situation in which making irrelevant
responses does not interfere with task
performance, the difference between
groups in favor of the anxious being
due, then, to their higher drive.
While stress might increase the drive
level of anxious Ss and hence the
magnitude or number of the
task-irrelevant responses, these latter
would still not compete with
task-relevant responses since the task is
the same.

Still another interpretation of the
relationship between anxiety and
stress has been suggested, the
predictions of which are quite opposed
to any of those previously discussed.
On the basis of their findings with
serial learning that the performance
of nonanxious groups deteriorated
with shock while that for the anxious 
did not, Deese, Lazarus, and Keenan 
(4) suggested that the MAS measures 
not so much anxiety as how indi- 
viduals defend themselves against 
anxiety, and further, that MAS scores 
are related to the hvysteria-psychas- 
thenia continuum. The latter pro- 
posal arose from the finding that 
(with overlapping items excluded) 
there was a positive correlation of 
40 between the MAS and the Psv- 
chasthenia (Pt) scale on the MMPI 
and a —.23 correlation between the 
MAS and the Hysteria (//y) scale. 
By assuming that nonanxious Ss are 
hysterical individuals who are unable 
to maintain their defenses in the face 
of objective inescapable stress (e.g., 
shock, as opposed to psychological 
stress), and therefore are greatly dis- 
turbed by it while the anxious are 
psychasthenic and therefore react to 
objective threat coolly and intellectu- 
ally, they believe their results become 
intelligible. The same explanation 
has been offered by Eriksen (5), who 
found that Ss scoring high on the Hy 
scale exhibited more stimulus gen- 
eralization in an investigation involv- 
ing shock than did high Pt Ss. These 
results, Eriksen stated, were inex- 
plicable in terms of drive theory. In 
attempting to evaluate these hy- 
potheses (and leaving aside any ques- 
tions of the clinical validity of the 
various measures employed) it might 
be well to inject a historical note. In 
developing a scale for the selection of 
Ss, the present writer deliberately 
attempted to include items descrip- 
tive of overt or manifest anxiety and 
avoided including items describing 
behavior not itself ‘‘anxious’’ but 
said to be a defense against an in- 
ternal anxiety precisely because it 
was the purpose of the scale to select 
Ss differing in functioning anxiety 
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level in the experimental situation; 
to the extent that defenses were effec- 
tive in keeping anxiety at a mini- 
mum, inclusion of items” 
on the scale would have been 
defeating. 

The conflict between the hypothe- 
sis of Eriksen, Deese, et a/., and the 
assumptions made by drive theorists 
in using the MAS is not whether some 
individuals scoring low on the scale 
are potentially anxious individuals 
with good defenses, but rather 
whether the introduction of special 
conditions such as shock so affect a 
sufficient number of low scoring Ss 
as to wipe out or reverse the direc- 
tion of difference in drive or emo- 
tionality between low- and high-scor- 
ing groups that exists under neutral 
conditions. If Ss are thus affected, 
drive theorists must either abandon 
the MAS for a different selective in- 
strument, or restrict themselves to 
testing groups in situations in which 


“defense 


self- 


defenses are assumed to be operating. 


An examination of the available evidence suggests that no modification
of the postulated relationship between anxiety scores and drive level
need be made at the present time (if it is understood that the purpose
of drive theory is to investigate the effects of drive once in
operation rather than the development of a comprehensive theory of
anxiety as a personality phenomenon). That is, the results of Deese
et al. (4) seem deviant; no other investigation involving noxious
stimulation (since psychological stress does not assault hysterical
defenses) has obtained results that would be expected if the anxiety
level of low scoring Ss increased up to or beyond that of the high
scoring Ss. If such stimulation has any differential effect at all, it
appears to be in the direction of increasing the anxiety of the anxious
group proportionately more than the nonanxious. Examining the Eriksen
results and accepting them as reliable, there seems to be no firm basis
for suggesting that drive theory would have predicted more stimulus
generalization for the high Pt group than the high Hy. Such a claim
rests on the assumption that all nonanxious Ss would be low Hy and all
anxious high Pt. The magnitude of the reported correlation
coefficients, particularly between the MAS and the hysteria scale, does
not make this assumption seem too reasonable. Even if high Hy Ss do
become disturbed under nonescapable stress, a sufficient number of Ss
could remain in the nonanxious group who were “genuinely” nonanxious,
or whose defenses remained intact, to have a nonanxious group exhibit
less stimulus generalization than the anxious. More relevant than such
armchair argument, however, are Rosenbaum’s (28) results. Using an
experimental arrangement very similar to Eriksen's, Rosenbaum found, it
will be recalled, more stimulus generalization for anxious than
nonanxious, and even more important, that the difference between groups
was significant only under the conditions of strong shock.


MAS AND CLINICAL MEASURES 
OF ANXIETY 

As was indicated earlier, the meaning of the term “anxiety” as used in
the studies attempting to determine the relationship between drive and
performance has been only in terms of MAS scores. While such pure
operationism is methodologically sound, the generality of these results
would be considerably expanded were a relationship established between
the MAS and more common clinical definitions of anxiety. Most valuable
would seem to be a comparison of scale scores with observers’ ratings
of overt behavior since other diagnostic tests of anxiety are
themselves purported to be indicators of such behavior. Fortunately,
several studies relating MAS scores and observational data have been
carried out. In the first of these investigations, reported by Gleser
and Ulett (9) of Washington University, a psychiatrist rated 151 normal
individuals and 40 psychiatric patients with overt anxiety as a
prominent symptom after an hour interview with each subject. Ratings
were made on an 8-point scale of anxiety-proneness, defined as the
tendency for overt anxiety symptoms to appear in a stressful situation.
For the total group the correlation between these ratings and MAS
scores was .61. Other similar studies by the Washington group (45, 46)
with more restricted samples indicated lower coefficients. In a study
of 110 male students, involving the judgments of two psychiatrists, the
ratings correlated .28 and .29 with MAS scores for the two raters,
while the interjudge reliability was .28 (46). All correlations were
significant. Lastly the Washington group reported a coefficient of .40
between the ratings of a single psychiatrist and anxiety scores for 141
normal Ss (45).

Operating in a student-counseling-center setting, Hoyt and Magoon (13)
asked experienced counselors to rate their own clients (N = 289) into
one of three groups: high, medium, or low manifest anxiety. Comparing
the mean MAS scores for each of the resulting anxiety groups, an
extremely significant chi square was found, while the contingency
coefficient, used as an estimate of the r to be expected if the
variable had been continuous, was .47. Using a still different
criterion of clinical anxiety, Kendall (16) had pairs of nurses rate TB
patients on their ward on a 7-point rating scale for each of nine
aspects of manifest anxiety. Selecting from the 93 patients so rated
the upper and lower 27% in terms of MAS scores, Kendall compared the
difference in mean over-all anxiety ratings for the two groups and
found it not to be statistically significant; taking only the upper and
lower 13% on the MAS, a very significant t between mean ratings was
obtained.
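
The relation between chi square and a contingency coefficient of the kind cited above can be illustrated with a minimal sketch. It assumes Pearson's standard formula, C = sqrt(chi square / (chi square + N)); the source does not say which formula or correction Hoyt and Magoon applied. The table of counts is purely hypothetical and is not taken from their study, and the function names are illustrative only.

```python
import math


def chi_square(table):
    """Pearson chi square for a table of observed counts (list of rows).

    Returns (statistic, n). Assumes every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic, n


def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (chi2 + N)), uncorrected."""
    statistic, n = chi_square(table)
    return math.sqrt(statistic / (statistic + n))


# Hypothetical counts: rows are rated anxiety groups (high, medium, low),
# columns are bands of questionnaire scores (high, middle, low).
hypothetical_counts = [
    [20, 8, 2],
    [10, 10, 10],
    [2, 8, 20],
]
print(round(contingency_coefficient(hypothetical_counts), 2))
```

Because C has an upper limit below 1.00 that depends on the number of categories, it is only a rough stand-in for a product-moment r, which is presumably why the coefficient is described above as an estimate.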

Finally, a study by Buss, Wiener, Durkee, and Baer (2) represents one
of the few investigations utilizing hospitalized psychiatric patients.
Each of their 64 patients was interviewed and then rated by four
psychologists on nine aspects of directly observed and reported
anxiety. Correlations between judges’ pooled ratings and MAS scores
ranged from .16 to .68 for these various aspects; the correlation with
an over-all rating of anxiety was .60.

The variation in the training of the raters, opportunity for
observation, rating scales, and populations from which the subjects
were drawn makes it difficult to formulate any statement about the
“validity” of the MAS. To the extent that all of these observational
criteria are themselves correlated and are agreed to be clinically
acceptable indices of manifest anxiety, there does seem to be some
relationship between MAS and observed behavior. These results suggest,
then, that the experimental results obtained with the anxiety scale
might also hold for groups selected according to clinical criteria.
Such studies as have been reported about the performance of clinically
selected anxious groups on comparable tasks tend to confirm this
suggestion (1, 20, 30, 47).

In addition to the experimental studies of the performance of anxious
and nonanxious groups already discussed, a number of other
investigations have reported differences in the behavior of anxious and
nonanxious Ss, ranging from indications of number of food aversions
(31) to performance in problem-solving tasks (21, 49). The exclusion of
these many experiments from consideration here, due to the limited
purpose of this paper, that of assessing the evidence directly relevant
to drive theory, points up what has not always been fully appreciated
about this theory. It is an extremely restricted one, referring only to
the effects of drive level (rather than all characteristics of anxious
and nonanxious individuals) in relatively simple learning situations.
The major prediction of the theory, that there is an interaction
between anxiety level and task complexity, seems to be fairly well
substantiated by experimental evidence, although more exact deductions
have either not been tested as yet or have not fared as well. Whether
the theory can be successfully applied to more complex situations than
those for which it originally seemed appropriate, as some have
attempted to do, or whether additional variables can be added to it and
thus broaden its usefulness remains for future research to determine.

Psychology has not been bypassed in the current general interest in
ionizing radiations. Since World War II a number of laboratories
maintained by the U.S. government have conducted research in this area.
In addition, several research projects have been sponsored by
government agencies in non-Federal institutions. On March 31, 1955,
there were in progress no less than seven such separate projects having
no security classification (50). All of this activity would seem to
warrant a brief review of the problem.


The Stimuli 


The biological effects of high energy radiations are ascribable
primarily to changes brought about in cells by ionization, defined as
the removal of electrons from atoms. Different types of radiations
produce biological effects differing primarily quantitatively, rather
than qualitatively. Two general classes of radiations may be
distinguished.

1. Material radiations consist of 
streams of particles which transfer 
their kinetic energy to the targets 
which they strike. The particles dif- 
fering in mass and/or electrical 
charge are neutrons, alpha particles, 
electrons (beta particles), deuterons, or 
protons. These radiations have been 
utilized only very rarely in behavioral 
studies. 

¹ I wish to express my gratitude to Dr. S. R. Tipton of the U.T.
Department of Zoology for critically reading portions of the
manuscript.

2. Electromagnetic radiations consist of oscillating electric and
magnetic fields. They do display also corpuscular (photon) properties.
Psychologists are familiar with “light” rays which lie in the frequency
range of 10^14 cycles per second (wavelength range 9 × 10^-5 to
4 × 10^-5 cm.).
Radiations above 10" cycles per sec- 
ond are capable of ejecting inner 
electrons from atoms. Radiations in the 10^18 to 10^20 cycles per
second range (10^-8 to 10^-10 cm.) are called X rays; those between
10^18 and 10^21 cycles per second (10^-8 to 10^-11 cm.) gamma rays
(the latter are usually produced by oscillating currents within the
atomic nuclei themselves). Gamma rays often accompany the
disintegration of radioactive substances.
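
The frequency and wavelength figures quoted for electromagnetic radiations are tied together by the relation wavelength = c / frequency. The following is a minimal sketch for checking such pairs of values, assuming propagation in vacuum and cgs units as in the text; the function names and the rounded value of c are illustrative choices, not taken from the source.

```python
# Speed of light in vacuum in centimetres per second (rounded; cgs units).
SPEED_OF_LIGHT_CM_PER_S = 2.998e10


def wavelength_cm(frequency_cps: float) -> float:
    """Wavelength in cm for a frequency given in cycles per second."""
    if frequency_cps <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_CM_PER_S / frequency_cps


def frequency_cps(wavelength: float) -> float:
    """Frequency in cycles per second for a wavelength given in cm."""
    if wavelength <= 0:
        raise ValueError("wavelength must be positive")
    return SPEED_OF_LIGHT_CM_PER_S / wavelength


# Short-wave end of the visible range, 4 x 10^-5 cm.
print(f"{frequency_cps(4e-5):.2e} cycles per second")
# A wavelength of 10^-10 cm, in the hard X-ray and gamma region.
print(f"{frequency_cps(1e-10):.2e} cycles per second")
```

On this relation a wavelength of 4 × 10^-5 cm corresponds to a frequency of somewhat under 10^15 cycles per second, and 10^-10 cm to roughly 3 × 10^20 cycles per second, so each tenfold drop in wavelength raises the frequency tenfold.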

The relative biological effectiveness of various radiations is a
function not only of the total number of ions formed, but also of the
spatial distribution of the ions in the tissues. The terms linear ion
density or linear energy transfer are used to express the relative
density of ionization per unit length of tissue. Beta and gamma rays
produce 6.3-11 ions per micron of tissue, 1,000 kv. X rays
approximately 15, 200 kv. X rays 80, and lower voltage X rays a still
higher number. Ionization following neutron radiations produces up to
9,000 ions per micron of tissue (26, p. 118). Biological effectiveness
of radiation may increase with, decrease with, or be independent of
linear energy transfer. Thus some activities are affected more by alpha
particles than by gamma rays, while in other functions the reverse may
be the case. In mammals we usually find that effectiveness increases
with ion density.

Measurement of Radiations

Ideally we would like to measure the actual amount of ionization in
tissues, but this is not possible. We must be satisfied with specifying
the physical characteristics of the source and the target. Ionization
of air is actually used as an approximation for ionization in tissues.
In the case of X and gamma rays the unit roentgen, r., is defined as
that quantity or dose of X or gamma radiation which produces in
0.001293 g. of air one electrostatic unit of ions (37, p. 90).

In the case of material radiations a different unit is used, the
roentgen equivalent physical, rep, which is “that quantity of ionizing
radiation which will produce 1.6 × 10^12 ion pairs per gram of tissue”
(37, p. 436). Occasionally, the roentgen equivalent man, rem, unit is
used which is that “quantity of radiation which when absorbed by man
produces an effect equivalent to that produced by absorption of one
roentgen of X or gamma radiation” (37, p. 436).
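
The roentgen and the rep can be related by simple arithmetic. The sketch below converts the roentgen definition (one electrostatic unit of ions in 0.001293 g. of air) into ion pairs per gram of air. It assumes that each ion pair carries one elementary charge of either sign and uses a rounded value for the electronic charge in electrostatic units; the constant and function names are illustrative and do not come from the source.

```python
# Charge of one electron in electrostatic units (statcoulombs), rounded.
ELEMENTARY_CHARGE_ESU = 4.803e-10
# Mass of air named in the definition of the roentgen, in grams.
AIR_MASS_G = 0.001293


def ion_pairs_per_esu() -> float:
    """Ion pairs needed to carry one electrostatic unit of charge of either sign."""
    return 1.0 / ELEMENTARY_CHARGE_ESU


def ion_pairs_per_gram_of_air(dose_r: float) -> float:
    """Ion pairs formed per gram of air by a dose given in roentgens."""
    if dose_r < 0:
        raise ValueError("dose must be non-negative")
    return dose_r * ion_pairs_per_esu() / AIR_MASS_G


print(f"{ion_pairs_per_gram_of_air(1.0):.2e} ion pairs per gram of air")
```

Under these assumptions one roentgen corresponds to about 1.6 × 10^12 ion pairs per gram of air, the same order as the figure given for tissue in the definition of the rep.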

A few values will be cited to make the roentgen unit more meaningful.
The safe human daily whole-body exposure has been set between 0.05 and
0.25 r. per day (37, p. 436; 64, p. 89). The threshold for the mitotic
effect in the grasshopper is 8.0 r. (64, p. 89). The 30-day 50 per cent
lethal dose after 100-250 kv. X-ray whole-body exposure is about 315 r.
for the dog, around 500 r. for man (55, p. 930).


GENERAL PRINCIPLES 
OF RADIOBIOLOGY 

It was pointed out previously that radiation-induced effects result
primarily from ionization producing physicochemical changes in the
living cells. Two general theories concerning the mode of action of
radiation have been put forward. According to the target theory certain
molecules of the cell are especially radiosensitive and it is the
change in these specific parts which accounts for the observed
radiation effects. Opposing theorists suggest that radiation affects
the cell as a whole by releasing certain chemical agents which
interfere with the normal cell metabolism. Actually there is evidence
supporting both viewpoints. For purposes of this review it is not
necessary to examine this problem any further.

The following variables are im- 
portant in the study of radiation ef- 
fects: (55) 

1. Quantity. In most instances 
effects are directly related to the dose. 

2. Rate of delivery or dosage (sum of doses accumulated over a period
of time). In most cases effectiveness of a given dose decreases with a
decrease in rate of exposure. Recovery may account for this. For
example, in the monkey a single dose of 7,500 r. applied to the spinal
cord produces paraplegia, but when the exposure is divided, two daily
5,000 r. doses or five daily 3,000 r. doses are required to produce the
same effect (48).

3. Type of radiation. Usually in 
mammals effectiveness is directly 
related to the specific ion density of 
the radiation. 

4. Manner of exposure. Responses 
to total-body irradiation are differ- 
ent from those in which only a se- 
lected part of the organism is ex- 
posed. Shielding of certain parts of 
the body (spleen, extremities, etc.) 
can decrease the effectiveness of 
total-body exposure. This is espe- 
cially important in the study of the 
effects on the c.n.s. since doses larger 
than the median lethal total-body 
dose are necessary for changes to be 
apparent. 

5. Time after exposure that observa- 
tions are made. Many of the radiobio- 
logical effects exhibit latencies. These 
may be of varying order of magni- 
tudes ranging from seconds to years. 

6. Species differences. The 30-day LD50² for total-body X irradiation
for the rabbit is approximately 800 r., for the guinea pig 200-400 r.,
rat 600-700 r., monkey 500 r. (55, p. 930).

7. Sex differences and individual 
differences within the same species. For 
example, the same dose of X rays 
kills more male than female mice, 
but affects the weight of females to a 
greater extent (7). 

8. Conditions of the organism. Conditions which may be called “stress,”
i.e., deviation from normal resting state, usually enhance
effectiveness of radiations. Vitamin deficiencies, infections, low
temperatures in unacclimated animals, exhaustive exercises,
adrenalectomies all seem to increase radiation effects.

9. Drugs and anoxia. Certain drugs like cysteine, glutathione, and
alcohol, as well as anoxia, actually depress radiation effects.

10. Reproductive activity of the tissues. As early as 1906 Bergonié and
Tribondeau hypothesized that proliferating tissues are usually most
radiosensitive. We find, for example, that while the nervous system of
adult organisms is relatively radioresistant the embryonic neurons are
extremely radiosensitive.

Radiation sensitivity varies considerably from tissue to tissue. For a
detailed discussion the reader may consult the radiation literature. We
shall mention here only a few of the effects of interest to the
psychologist.


The hematopoietic system is extremely radiosensitive. A decrease in the
number of circulating lymphocytes is one of the most sensitive
indicators of radiation overexposure. Other blood components also show
pathological changes. Hemorrhagic manifestations are also quite common
after acute irradiation. Vascular changes are major contributors to the
brain pathologies observed after large doses of irradiation (9).
Generalized circulatory changes are only minor after median lethal
doses, but with larger doses the effects are more pronounced (56).

² Dose required to kill 50 per cent of the animals during the first
30-day postirradiation period.

There is some disturbance in water 
metabolism. Several studies have 
reported changes in water intake 
after X irradiation (15, 53, 54). 

The gastrointestinal tract is extremely radiosensitive. Anorexia,
nausea, and vomiting are among the clinical symptoms of radiation
sickness (overexposure to radiation). Depression of food intake and a
loss of body weight can be observed in irradiated animals. The
magnitude and duration of the depression are a function of the dosage
(63). Loss of body weight can thus be used as an indication of
radiation sickness.

The endocrine glands, except for the gonads, are relatively resistant
to radiation damage. However, radiations act as stressors and they give
rise to the well-known pituitary-adrenocortical stress response (56).

The cornea, conjunctiva, and the lens of the eye are also quite
radiosensitive but the latency of human radiation cataracts may be
measured in terms of years (56).

Muscle is very resistant to radiation. The nervous system is also
relatively radioresistant. Both will be considered in greater detail
further on.


PRE- AND NEONATAL RADIATION 

An excellent review of the effects 
of prenatal irradiation has been writ- 
ten by L. B. Russell (61). 


One of the crucial variables in prenatal irradiation is the stage at
which exposure occurs. Russell (61) divides the mammalian gestation
period into three stages: preimplantation, major organogenesis, period
of the fetus. In the rat these periods correspond to the following
postconception days: 0-7, 8-15, 16-term. During the preimplantation
period radiation produces a high percentage of prenatal deaths, but the
survivors are usually normal. The radiation during the period of major
organogenesis results in lower prenatal mortality, but it is the most
sensitive period for the production of morphological abnormalities.
Radiation during the period of the fetus produces lesser changes. Among
the most sensitive systems during the prenatal period is the central
nervous system. Russell (61) quotes studies dating back to 1907 which
show marked morphological changes following X irradiation. In a series
of studies on rats and mice Hicks (29, 30) showed that X irradiation
during different stages of the gestation period affects different parts
of the nervous system. There seem to be critical periods for
abnormalities of various types. Irradiation during the first eight days
of embryonic life produced no effects on the n.s. of surviving animals.
Irradiation on the ninth day resulted primarily in anencephaly; on the
tenth day in encephalocele and cerebral deformation; on the eleventh
day it narrowed the aqueduct, produced hydrocephaly or encephalocele;
on the thirteenth to the sixteenth day the basal ganglia, cortex,
hippocampus, and corpus callosum were damaged. From the sixteenth day
of gestation through the neonatal period the cerebellum is especially
radiosensitive. Hicks emphasizes, however, that the above periods are
only indices of the most frequently occurring pathologies and that
there is no exact relationship between age of irradiation and specific
malformations. Also it should be pointed out that it is rather
difficult to determine the precise age of embryos. Wilson and
co-workers (68, 69) have shown that neural damage is directly related
to dosage. They irradiated rats on the ninth and tenth day of gestation
with doses ranging from 25 to 400 r. On the ninth day 25 r. produced
ocular malformation in only a small percentage of animals; 50 r.
affected 72 per cent of the animals; 100 r. produced anophthalmia,
microphthalmia, or other ocular malformations in 90 per cent of the
animals; 200 r. proved fatal to all embryos. Brain damages showed
similar trends. The data for the animals irradiated on the tenth day of
gestation were similar, except that the doses required were higher.
Fifty r. had little effect, but 100 r. resulted in anomalous eye
development in 75 per cent of the cases.

In this connection a study by Rugh et al. (59) is of some interest. Rat
fetuses 13.5 days old were exposed to 300 r. of X irradiation. In
animals examined four hours after exposure the retinae revealed massive
damage. On the other hand, animals examined six to seven days after
birth had few signs of injury. Apparently a recovery process took place
not by repair of the damaged cells, but by proliferation of the more
radioresistant precursor neuroectoderm cells.

There are a number of clinical reports of various abnormalities such as
microcephaly, hydrocephaly, mental deficiency, ocular malformations,
blindness and other types of neural malformations which are ascribed to
fetal X irradiation (25, 49). Microcephaly is the most frequently
reported abnormality, accounting for 17 out of 25 abnormal cases in one
study (49). In some clinical studies, however, no damage is reported
following pelvic irradiation during pregnancy (61, p. 909). It is
possible that the exposure in the latter cases occurred after the
critical period. In the study of 30 pregnant women who showed one or
more major signs of radiation following the Nagasaki atomic bomb blast,
four out of sixteen surviving children showed signs of mental
retardation (70). The report does not specify the nature or extent of
the deficit.

So far only two studies have been reported which measured specific
behavioral consequences of prenatal irradiation. Levinson (42) fetally
irradiated rats with 300 to 600 r. X rays on the 11th, 13th, 15th,
17th, and 19th postconception days. When the animals were 50 days old
they were tested on a Lashley Type III maze. Learning measured in terms
of number of trials necessary to reach a criterion, number of errors,
and time spent in the maze was impaired with the deficits directly
related to the radiation dose. Radiation on the 13th day produced the
greatest changes. This agrees roughly with Hicks’ timetable for
cortical damage (29, 30). Variability was larger in the experimental
groups than in controls. Tait et al. (65) X-irradiated rats during the
final week of pregnancy using 30, 90, 180, and 360 r. The offspring of
the animals receiving 90 or more r. were significantly poorer maze
learners than control animals.

Summary. While there is a great deal of evidence for the relative
radiosensitivity of the fetal nervous system, our behavioral data are
rather scant. We do not know what kinds of activities aside from maze
learning are affected nor the lower thresholds of radiation-induced
changes. The latter may be of practical significance.


THE ADULT NERVOUS SYSTEM


It has been known for a long time that the adult nervous system is
relatively radioresistant. Doses in the median total-body lethal range
produce no observable neural changes. However, with larger doses, in
the case of mammals generally over 1,000 r., a number of investigators
have obtained definite signs of neural degeneration in a variety of
organisms: man, monkey, dog, rat, rabbit, fish (1, 2, 6, 9, 10, 12, 27,
31, 45, 46, 48, 59, 61). In general the amount of degeneration observed
is directly related to the dose, and conversely an inverse relationship
holds for the latency (2, 9, 10, 12, 31, 59). With relatively low
doses, a few thousand r., the latency may be a matter of months, a
year, or longer (2, 10, 27, 45). Many investigators assume that the
initially observed neuronal damage is a secondary effect resulting from
damage to the vascular system in the brain (6, 9, 27, 58, 60). Some
recent studies, however, deny the necessity of this assumption (2). It
might be mentioned here also that because of certain methodological
advantages the use of radioactive cobalt has been proposed for the
production of circumscribed brain lesions (62).

Aside from histological studies, we have information on functional
changes. Reflex excitability decreases as a function of dose, with high
doses abolishing the reflex completely (19, 20). Frequently enhancement
precedes the depression (2, 23, 39). But again it should be emphasized
that median total-body lethal doses produce no easily measurable
changes (13).

In a study in which the heads of rabbits were irradiated using 12,500
r. (23) after a latent period of 30 minutes a convulsive phase with
grand mal seizures appeared. This was followed by a somnolent phase of
two hours’ duration in which the animals were quite inactive. Finally,
in the last stages before death, ataxia was the most pronounced
symptom. Changes in equilibrium and disorientation in space have been
reported by a large number of investigators (1, 10, 46, 52, 58, 60).
This is in accord with several histological reports that the brain stem
and cerebellum are the most frequent sites of radiation necrosis. Hemi-
or quadriplegia is a common symptom after large doses (2, 10, 46, 48,
58).

EEG changes have been recorded 
by several investigators (2, 9, 13, 39, 
58), but again the lower threshold is 
above the median total-body lethal 
dose. The typical pattern is similar 
to that seen in seizures, i.e., periodic 
spikings, high amplitude slow waves. 

The most sensitive parts of the 
brain are the hypothalamus, glial 
cells, brain stem including the me- 
dulla and the cerebellum (2, 6, 9, 10, 
12, 31). The cortex is more radio- 
resistant than these structures, and 
this is of course significant in be- 
havioral work. 

The peripheral nervous system is 
even less sensitive than the c.n.s. to 
radiations (32). Doses below 10,000 
r. are ineffective. It takes 45,000- 
75,000 r. to abolish nerve conduction 
in peripheral fibers (22). The auto- 
nomic n.s. responds with a vagotonia 
after an initial short duration sym- 
pathicotonia (66). A slight decrease 
in pulse amplitude has been reported 
already after 750 r. Also certain para- 
sympathomimetic effects may be ob- 
served during radiation sickness (56, 
p. 996). 

Skeletal muscles are also relatively radioresistant. With doses below
6,000 r. no abnormality may be observed (43). Gerstner et al. (21)
applied 50,000 r. to the rabbit gastrocnemius and they noted that
fatigue effects could be observed only when high performance was
demanded by using a heavy load or a high frequency of stimulation.

BEHAVIORAL CHANGES


Almost since the discovery of X rays investigators have reported
various changes in organisms following radiation. Lyman et al. (46) in
their exhaustive 1933 review of neural changes refer to a study by
Tarkhanov who in 1896 observed quieter behavior in flies following X
irradiation. There is also an abundance of individual clinical case
studies in which radiation was applied for therapeutic purposes. This
review will emphasize primarily those studies, however, which were
designed specifically to investigate behavioral effects. The latter
include those phenomena customarily included in the field of
psychology.


Learning and Performance 


The first attempts to assess the effects of radiations on learning were
performed in Pavlov’s laboratory. Nemenow (51, 52) irradiated the head
of one dog with a dose of 1,500 r. There was only a slight drop in his
salivary CR’s. After an additional 2,200 r., however, the CR’s
practically disappeared for a period of five weeks. A second dog
received 3,500 r. then again 2,800 r. and the results were essentially
similar to those seen in the first animal. Lyman et al. (46)
X-irradiated the occipital part of the head of four dogs with massive
doses of 17,000-18,000 r. after their CR’s had been stabilized. All of
the animals showed a temporary decrease in their salivary CR’s, but the
onset and duration of this decrement varied. Two of the animals
(“excited types”) actually showed a rise in CR’s preceding the drop.
The strength of the responses also varied as a function of the type of
CS. One of the animals kept alive for six months after the treatment
exhibited a second lowering of CR’s following the recovery from the
first decrease. This latency in the gross pathological manifestations
is consistent with the other investigations discussed in the previous
section. The change in the CR’s occurred during a period when the S
exhibited ataxia, impaired vision, circus movements, and general
deterioration of behavior. It was difficult to test the dog.
Interpretation of the data from the whole study is obscured by the
observation that in three Ss not only the CR’s but the UR’s also showed
a drop.

In a study for which an abstract only is available, Harlow (28) reports
that radon tubes inserted into the cortex of ten rhesus monkeys
produced progressive loss on delayed reaction, patterned string tests,
and simple position habits. No data are given for the dosage used. It
was probably quite large in view of other negative findings reviewed
below.
No further work was done in this 
field until after World War II. 
Furchtgott (16) tested rats exposed 
to 200—500 r. of total X radiation in a 
four-unit water maze. Neither acqui- 
sition nor retention using several cri- 
terion measures was affected by the 
treatment. Arnold (3) exposed the 
heads only of rats to 300-800 r. and 
tested them for retention of a 14- 
unit T-maze habit and other irradi- 
ated Ss were tested for the learning 
of the habit. No statistically signif- 
icant changes were found... Fields 
(14) studied performance on elevated 
runwavs, 32- 40-choice-point 
elevated T-mazes, and a 10-choice-5- 
vertical maze of some 500 male 
rats which had received doses rang- 
ing from 100—1,000 r. On the whole 
radiation had little effect on the per- 
formance of the animals except for a 
decrease in the speed and amount of 
activity immediately following irra- 
diation which was probably due to 
the general radiation malaise. Davis 
(11) tested rhesus monkevs in the 


and 


stace 
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Wisconsin General Test apparatus 
following X irradiation but was unable 
to find any impairment of perform- 
ance on discrimination-type tasks. 
In a series of studies sponsored by 
the U.S. Air Force School of Aviation 
Medicine (34, 57), monkeys were 
tested on acquisition, retention, and 
transter of multiple discrimination 
problems immediately and 150 days 
after exposure to sublethal and lethal 
doses of X rays. Again the reported 
results failed to demonstrate any de- 
leterious effects. The only deficit 
that was noted was an increase in re- 
action time. 

Garcia et al. (18) established a conditioned aversion to a saccharine
solution which was associated with exposure to gamma irradiation.
Experimental animals had saccharine solutions in their cages while
being exposed for six hours in the gamma field, while control Ss had
tap water. Preference was then tested for 63 postirradiation days.
The control group showed no loss of their natural preference for
saccharine, while experimental Ss exposed to only 30 r. showed a
significant drop in their saccharine intake. The authors hypothesize
a general behavior disturbance during radiation which became
associated with the taste stimuli. It should be pointed out that the
animals were being exposed at a very slow rate and some of the
general radiation malaise might have been effective for a sufficient
length of time for the conditioning. The effectiveness of the low
dose used is surprising, however.

Jones et al. (33) measured the effects of 200-1,000 r. of whole-body
X irradiation on activity-wheel performance, using 194 rats. Data
were analyzed separately for animals that survived the eight-week
experimental period and those that succumbed to radiation injury.
Rats which died during the first nine postirradiation days showed a
gradual decrease in activity until death. Those that survived nine
days but died subsequently showed a decrease immediately after
irradiation followed by a recovery and a second depression of
activity prior to death. All of the surviving animals (doses 200-680
r.; all animals with higher doses died) showed a decrease in activity
postirradiation. The 200-300 r. groups recovered completely by the
fifth day. The higher-dose groups also showed a partial recovery
during the first postirradiation week, but they exhibited a second
depression during the third week. The 400-450 r. groups attained
normal levels of activity four weeks after irradiation, the 681 r.
groups after eight weeks. In general there was a direct relationship
between degree and duration of activity depression.

In another study the same group of investigators (36) tested the
effects of 300-1,000 r. X irradiation on exhaustive swimming
exercise. The rats were placed into a 24-gallon tank where they were
forced to swim until they were exhausted and sank, remaining below
the surface of the water for longer than 30 seconds, at which time
they were retrieved. Length of swimming time before sinking was
measured. Following radiation, performance gradually decreased,
reaching a minimum level during the third to fourth postirradiation
week. From then on there was a gradual return to the normal level,
which was attained by the ninth week. While the depression was
directly related to dose, the 300 r. group barely differed from
control animals. The 500 r. group, however, showed a significant drop
and the higher r. animals in turn differed significantly from the
500 r. group.

Furchtgott (15) subjected adolescent rats to 300 and 500 r. of X rays
and tested their swimming speed in a 12 ft. straight-away tank for 13
days. The 300 r. group did not differ from the controls, but the
survivors in the 500 r. group were significantly slower.

Vogel (67) irradiated daily with 50 r. of X rays six aggressive mice,
each of which, prior to the treatment, always defeated submissive
animals. Even after irradiation the aggressive animals continued to
be dominant until shortly before death.

McDowell (47) observed a reduction in "other-animal involved"
behavior and visual attention to the activity of other animals
following 400 r. of X irradiation in 10 rhesus monkeys. The animals
also showed fewer instances of aggression and a greater incidence of
lethargy. All of these symptoms are easily understandable considering
the general malaise which is associated with radiation.

Leary and Ruch (38) exposed 18 rhesus monkeys to 200-400 r. of
total-body X irradiation. Cage-crossings were not affected. On the
first postirradiation day, in the 400 r. animals only (the others
were not observed), scratching, grooming, and other signs of activity
were depressed, a sign of general malaise. Mechanical puzzle
manipulation did not produce statistically significant differences
between pre- and postirradiation periods. Pedometer manipulation was
impaired in the 400 r. animals (others were not tested) and,
surprisingly, weight-pulling, supposedly a measure of general
strength, did not decrease in all animals.

In general it may be said that radiation produces a certain amount of
depression in activity which should be most apparent when motivation
is low or when the task requires a great deal of effort, as in the
exhaustive swimming experiment (36). The latter effect would tend to
parallel the findings of Gerstner et al. (22) on the effect of X
irradiation on muscular contraction.

There is a puzzling report of a clinical study of 120 patients who
had received several single doses of 30-50 r. during a 7-10 day
period, totaling 150-250 r. of diencephalic X irradiation (4).
Immediately following the treatments the typical symptoms were
numbness, apathy, and tingling sensations in the head region. The day
after the irradiation, however, most patients reported spontaneously
that they felt euphoric, active, and generally tranquil. This state
lasted from a week to several months. Most of the treated patients
were neuropsychiatric cases with diagnoses of urticaria, migraine,
depression, etc. However, in addition, two medical collaborators
subjected themselves to 100 r. administered to the diencephalon and
they also experienced the same changes as the patients. Sixty-one of
the patients reported changes in their sleep patterns. The sleep on
the night following the treatment was usually characterized as
"extremely deep," "heavy," or "leaden." In addition 37.5 per cent of
the Ss reported sexual changes, notably improvement in libido,
potency, and the menses. The authors ascribe these changes to
hypothalamic stimulation, primarily of the anterior, parasympathetic
nuclei, a finding in accord with the frequently reported
radiation-induced vagotonia (66). These results, if confirmed by
other investigators, should have therapeutic implications. They also
raise many questions of interest to the experimentalist working with
animals, since we have practically no data on emotional behavior
following radiation.

Summary. The lack of any dramatic changes in learning functions
following sublethal or just lethal total-body X irradiations reported
by several experimenters agrees very well with similar
neurophysiological observations on the resistance of the nervous
system in that dose range. It takes doses which are well above the
median total-body lethal range to produce any neural changes and then
there is usually a considerable latency. In the one study in which
there was an 18-month time lapse between the treatment and testing,
no drastic decrements took place (14). Whether a longer period would
have any effects is an open question. While acquisition, retention,
or transfer are not affected, performance indices which utilize gross
muscular activity are impaired to some extent and this impairment
persists for a number of months (in the study of swimming endurance
[36] up to nine months for the most heavily irradiated group).
Another factor to be considered is what might be called, for the lack
of a better name, general malaise, which includes a lack of
motivation to respond to stimuli or initiate activity. It is present
immediately following radiation and appears again during the second
week in more heavily irradiated animals. This is accompanied also by
a loss in appetite and drop in body weight.


Sensory Functions 


Hearing. In the clinical literature there are reports of improved
hearing following X irradiation. In the early thirties Girden (24),
working in Culler's laboratory, attempted to investigate this problem
using dogs in the classical conditioning setup. Standard
psychophysical procedures were employed to obtain absolute intensity
thresholds. Subsequently the heads of twelve animals were irradiated.
The study was exploratory in nature and there was no systematic
design to test radiation factors. Eight animals were irradiated using
80-100 kv. peak voltage and 5 ma., while four animals got roentgen
rays generated at 200 kv. peak and 5 ma. One animal received 5 r.
every day for five months, one 5 r. for four days, one anywhere from
100-1,100 r. on seven days, spaced one to seven days apart, and so
forth. The total dosage varied from 20-11,100 r. The animals which
were irradiated with the 80-100 kv. rays all showed a transient gain
in acuity which averaged 5.5 decibels after a latent period of seven
to eleven days. Dosage was apparently not involved since the changes
appeared even after the surprisingly low value of 20 r. None of the
Ss irradiated at 200 kv. showed any improvement in acuity.

In a second study Brogden and Culler (5) examined more critically the
effect of dose and also the frequency variable. Ten animals were
irradiated at nine different intensities ranging from 75 to 675 r.
The gain in acuity was independent of the dose, varying from 3.84 db
to 7.87 db. The duration of the gain was also independent of the dose
and it varied from 8.0 to 10.3 days. The latency, however, was
inversely related to the dose. At 75 r. it was 7.6 days and at 675 r.
it was 2.6 days.

To explore the mechanism of the phenomenon, two dogs were
hypophysectomized and irradiated; auditory tests were conducted on a
diabetic subject and a normal dog when blood-sugar levels were high
and low (by injection of insulin); and blood-sugar levels were
measured before and after irradiation in one dog. In all of these
cases hypoglycemia was associated with lower auditory thresholds. The
authors hypothesize that low sugar levels lessen density and
viscosity of cochlear fluids, and thereby decrease resistance to
incoming vibrations, and perhaps also the ionic conditions in the
cochlea affect the magnitude of the cochlear potentials.


Vision. Fields (14) found no effects on brightness or acuity
discrimination in rats following X irradiation. Russian workers (35)
have reported that dermal X irradiation increases the threshold to
dark adaptation and that this effect persists for several days.
Lenoir (41) tested dark adaptation in 11 patients following
therapeutic irradiation. In all cases there was a decrease in dark
adaptation which was independent of the dose (2,400-6,240 r.). The
changes could be detected for 20 to 36 days. The author ascribes this
reduction in dark vision to a drop in vitamin A concentration which
follows the X irradiation. Furchtgott (17) tested brightness
discrimination in a Lashley jumping box under conditions of low
illumination following 369-469 r. of X irradiation. The performance
of the irradiated rats was slightly inferior to that of control
animals. It should be noted here also that Cibis et al. (8) found
that rod cells are considerably more radiosensitive than cones.
Destruction of rods required 1,700-2,000 r. while the threshold for
cones is 10,000-30,000 r.

The work on cataract formation has been reviewed adequately (40) and
it is omitted here since the studies involve primarily morphological
changes.

Other senses. The work on other senses is scant. Lindemann (44)
observed fifteen patients who received therapeutic X-ray treatments
for tumors in the oral cavity. Taste sensitivity and in some cases
odor sensitivity were depressed for several months. In an unpublished
study Furchtgott found some indication of lowered thresholds to
electric shock in rats following sublethal doses of whole-body X
irradiation.

Summary. While there is some evidence for changes in sensory
functions, notably hearing and scotopic vision, after irradiation,
the available data are quite limited. Much more research is necessary
in the various sensory modalities to determine the factors, if any,
which affect perception.


SUMMARY AND CONCLUSIONS 


The published studies pertaining to the behavioral effects of
high-energy radiations were reviewed. More studies have actually been
performed in this area. The author knows of several additional ones,
performed by himself and by others, but the negative results have
discouraged the workers from publishing them.

Underlying any discussion of the behavioral effects of radiation is
the relative radioresistance of the adult nervous system. Total-body
doses in the median lethal range do not seem to produce any gross
neural dysfunctions. Except for the instances in which the body is
shielded and the radiations are applied to the head only, death will
intervene long before any neural changes can be observed. Thus we
will not find any significant behavioral changes in those activities
which are mediated directly by the nervous system. We have reviewed
several studies of learning by different investigators which seem to
bear this out. Actually it is possible that an investigation will
show a decrement in learning following radiation. However, this would
be primarily a reflection of the change in the nonassociative
learning factors, i.e., motivation and perception of the stimuli.

We have pointed out that radiation produces changes in the blood and
body fluids, gastrointestinal tract, and some of the endocrine
secretions. Thus the homeostatic energy-controlling mechanisms are
affected and we should find, therefore, changes in motivation and
performance. We have indeed seen that some of these functions are
altered. Changes in food and water intake, exhaustive swimming
exercise, activity-wheel and pedometer performance, and social
behavior have been reported. There are still a number of problem
areas here, such as emotionality and motivation other than hunger and
thirst, which have not been investigated. Here we should mention
again that radiation seems to lead to the pituitary-adrenocortical
stress reaction and that the hypothalamus and the autonomic nervous
system are relatively more sensitive than the cortex. It would seem
also that performance which requires a large expenditure of energy,
or for which extrinsic incentives are very small, will be affected
the most by radiations.

In the sensory field some experimental work has been reported on
hearing and vision and we also have clinical data on these and other
modalities. On the whole, however, there are large gaps here.

The great sensitivity of the developing nervous system was briefly
discussed. The quantity of behavioral data does not approach our
knowledge of morphological changes. We have only two studies on maze
learning in rats. It would seem that this area should be explored in
greater detail and functions other than maze learning could be
explored.

The genetic aspects of radiation were not considered since we have no
data here on variables which are conventionally classified as
psychological.


The recent paper by Meehl and Rosen (3) presents a rationale for
evaluating the predictive efficiency of psychometric instruments
which should be of interest and importance in clinical research. The
purpose of this comment is to emphasize the principle of the
dependence of statistical criteria on administrative and personnel
policy in selecting the appropriate criterion among the cases which
they have effectively presented.

The basic reference statistic for evaluation for predictive
efficiency is the base rate, according to Meehl and Rosen (3, p.
194). Evaluation of any predictor requires comparison of results
based on prediction with the base rates prevailing in the situation.
Thus, if one thousand candidates were available for military service
and the base rate of noneffectiveness were 5%, 950 successful
candidates might be expected without screening. Now, if a screening
device operated to admit fewer than 950 successful candidates in the
same situation, Meehl and Rosen would consider such a test less
efficient than the base rate.


Their analysis considers three separate cases. The first is
efficiency in detecting cases of poor adjustment. Here they classify
as errors of prediction only the false-positives rejected. When the
false-positive rate is higher than the base rate of noneffectiveness,
they would conclude that use of the screening test would be less
efficient than no screening at all. The second case is efficiency in
prediction for all cases. Here they classify as errors of prediction
both the false-positives rejected and the false-negatives accepted.
When the number of successful cases attained through a sample of
available individuals is lower as a result of screening than could be
expected according to the prevailing base rate, they would consider
such screening inefficient. The third case is called efficiency in
detecting cases of good adjustment. Here only false-negatives are
regarded as errors. Thus to the extent that the proportion of
successfuls in the sample accepted is greater than expected according
to the base rate, they would consider screening to be efficient. They
point out, however, that such efficiency is relative, inasmuch as it
purchases increased efficiency of personnel accepted at the cost of
rejecting some potentially successful candidates in the screening
process. Although the point is implied by Meehl and Rosen, it seems
important to emphasize as a general principle that the choice of the
appropriate test of efficiency depends on the policies in effect and
the purposes of screening required to fulfil them. Widespread
misunderstanding of this principle could seriously impair the status
of many useful screening and prediction programs. All too often
scientists are too preoccupied with considerations of validity, while
they fail to recognize the practical problems facing administrators
who utilize psychometric techniques. On the other hand,
administrators need to understand this principle so that they may
avoid the error of rejecting useful methods as well as that of
accepting inefficient ones, through faulty evaluation.

The writers wish to express their appreciation to Dr. Samuel
Fulkerson for contributing to the discussion which culminated in the
present paper.

With specific reference to induction screening of military personnel,
the manpower administrator is concerned with supply and demand issues
on one hand, and with the burden of additional administration and
loss of productive work due to noneffectiveness on the other. In
times of manpower scarcity, he may be pressed to utilize every
available man. Under such circumstances, he would seek to admit the
maximum number from the pool available. Then the Meehl-Rosen Case 2
would be properly applied in evaluating prospective screening
devices.

If, however, manpower shortages were less pressing, or if the waste
attributable to noneffectiveness were considered too great, the
administrator might be agreeable to the rejection of some potentially
successful individuals by a screening device which could assure a
greater proportion of successful candidates from the number admitted
than might be expected according to the base rate. The gross number
of successful candidates for any available sample would be less,
depending upon the rejection rate for the particular screening
device, but the noneffectiveness rate might be reduced. In these
circumstances Case 3 would be appropriate to evaluate the increase in
proportion of successful candidates as a result of screening and Case
1 could be used to evaluate the cost in terms of false-positive rate.

The criterion implied in Case 2 requires maximization of the number
of successfuls in relation to the total pool available, whereas Case
3 requires maximization of the number of successfuls in relation to
the number admitted. Both need to be evaluated against the base rate.
The former criterion may be dictated in circumstances of manpower
scarcity, while the other would reflect a policy decision more
sensitive to the cost of accepting and caring for noneffective
individuals in hospitals, guardhouses, and nonproductive jobs.²
Policy, not mathematical reasoning, must dictate the appropriate
criterion of evaluation and the proportion of incorrect predictions
which can be accepted.
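
The difference between the two criteria can be made concrete with a
short calculation. The sketch below is illustrative only: the pool of
1,000 candidates and the 5% base rate of noneffectiveness are taken
from the example given earlier, but the screening outcome (how many
effective and noneffective candidates are rejected) and all function
and variable names are assumptions introduced here, not figures from
Meehl and Rosen.

```python
# Illustrative sketch: compare a hypothetical screening device with the
# base rate under the Case 2 and Case 3 criteria described in the text.

def evaluate_screening(pool, base_rate_noneffective,
                       rejected_effective, rejected_noneffective):
    """Return the quantities compared under Case 2 and Case 3."""
    noneffective = round(pool * base_rate_noneffective)
    effective = pool - noneffective

    admitted_effective = effective - rejected_effective
    admitted_noneffective = noneffective - rejected_noneffective
    admitted = admitted_effective + admitted_noneffective

    return {
        # Case 2: successful candidates obtained from the whole pool.
        "successes_without_screening": effective,
        "successes_with_screening": admitted_effective,
        # Case 3: proportion of successful candidates among those admitted.
        "success_rate_without_screening": effective / pool,
        "success_rate_with_screening": admitted_effective / admitted,
    }


# Hypothetical device: rejects 100 effective and 40 noneffective candidates.
result = evaluate_screening(pool=1000, base_rate_noneffective=0.05,
                            rejected_effective=100, rejected_noneffective=40)

case2_favors_screening = (result["successes_with_screening"]
                          > result["successes_without_screening"])
case3_favors_screening = (result["success_rate_with_screening"]
                          > result["success_rate_without_screening"])

print(result)
print("Case 2 favors screening:", case2_favors_screening)
print("Case 3 favors screening:", case3_favors_screening)
```

Under these assumed figures the same device is judged inefficient by
the Case 2 criterion (850 successful candidates against 950 without
screening) and efficient by the Case 3 criterion (about 98.8% of
those admitted are successful, against 95% at the base rate), which
is why the choice between the criteria has to rest on policy.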

The writers feel that in view of the general excellence of Meehl and
Rosen's paper, their oversight in connection with their discussion of
Case 1 should be mentioned. They demonstrate (3, p. 195) that the use
of the Danielson and Clark (2) screening inventory would result in a
decrease in the total percentage of correct predictions made (from
95% to 79.7%) when comparing the test with the base rates. They do
not, however, indicate that the screening inventory has actually
succeeded in raising the percentage of correctly predicted "fails"
from 5% (base rate) to 13%. Later they do recognize this kind of gain
when they demonstrate (3, p. 204) that a certain cutting score on the
Glueck prediction index succeeds in correctly identifying delinquents
with an accuracy of 92.6% as compared with an expected 20% base rate,
even though predictions are made for only 2.4% of the population.


² It is of interest to note that the current induction screening
policy of the armed services emphasizes the second criterion
described (1, 4, 5).


There have been several recent instances in the psychological
literature (1, 2, 6, 8, 9) of the use of the statistic known as
Kendall's tau (τ), a nonparametric correlation coefficient. It is to
be hoped that its use reflects a growing realization among
psychologists of the inadequacy of the Pearson product-moment
coefficient (r) in a number of circumstances. Some of these
circumstances are:

1. When the variates to be correlated show sharp departures from
normality. Although the distribution of sample r's from nonnormal but
uncorrelated populations differs only slightly from the normal case
(4), it may differ considerably when the true r is not zero, kurtosis
rather than skewness being the more important factor (3).

2. When the variates to be correlated are unmeasurable according to
an objective scale, as in the case of ratings or preferences of
judges, or when precise measurement is impractical and the raw data
must be sets of ranks. Under these circumstances, the evaluation and
interpretation of r often requires assumptions which it would be
imprudent to make.

3. When there is reason to believe that the regression of one variate
on the other is nonlinear, r will tend to underestimate the degree of
interdependence.

¹ The preparation of this paper was supported in part by Research
Grants M-658C and MH-301 from the National Institutes of Health, U. S.
Public Health Service. This work was done while both authors were at
the Iowa Child Welfare Research Station

The use of a rank-correlation coefficient requires no assumptions
regarding the form of the distributions of the variates and is thus
admirably suited to the resolution of the difficulties posed by the
first two circumstances. A rank coefficient also will not
underestimate a relationship even when regression is nonlinear so
long as the regression function is monotonic, which is usually the
case in psychological research.

These considerations apply both to τ and to the better-known rank
correlation, Spearman's rho. This paper, however, will be concerned
entirely with the former since it has a number of advantages over
rho, and is rarely discussed in current statistical texts. The most
important of these advantages is that the significance of a sample
tau (τ) can be evaluated with certainty in terms of the normal
probability integral for all but very small values of n. Furthermore,
confidence limits for τ can be determined from sample τ's. If the
rank-order coefficient is regarded as merely a rough approximation of
r, these considerations are not particularly important. When it is
used as a test of an hypothesis for which it alone is appropriate, as
is often the case, then these advantages, especially the former,
become significant.

Tau can also be used for the computation of both partial and multiple
correlation coefficients. However, neither of these measures is very
useful at present, as we shall indicate in a subsequent section.


DEFINITION AND INTERPRETATION 


Tau is defined as

$$\tau = \frac{S}{\tfrac{1}{2}\,n(n-1)} \qquad [1]$$

where n is the number of items ranked and S = (P − Q), where P is the
number of item pairs on the order of which both rankings agree, and Q
is the number on which they disagree. Tau will vary from +1.00 when
all possible pairings are ranked concordantly, to −1.00 when all
pairings are ranked discordantly.

Consider the following rankings of the "ambiguity" of eight sentences
made by two judges, with the rankings of Judge A arranged in the
natural order:

[Table of the rankings by Judge A and Judge B; not legible in the
source.]

The first sentence, i.e., the one ranked 1 by Judge A, has to its
right in Judge B's rankings 3 larger ranks and 4 smaller ranks. We
allot +1 for each of the larger ranks, and −1 for each of the smaller
ranks. The second sentence in Judge A's ranking has to its right in
Judge B's ranking 6 larger ranks and no smaller ranks. And so on
through sentence 7. Allotting in this fashion, P_1 = 3, Q_1 = 4;
P_2 = 6, Q_2 = 0; and so on. P, the sum of the P_i, is 17, and Q, the
sum of the Q_i, is 11. S, which is P − Q, is thus 6. With n = 8, we
obtain according to [1] a τ of 6/28 = .21.

The interpretation of τ follows readily from [1] since n(n − 1)/2 is
the total number of item pairs with respect to which the rankings can
be compared. A given value of τ asserts that the statement "The order
in which two items are ranked according to one variate (or judge)
will be the order in which they are ranked by another variate (or
judge)" will be correct (100 + 100τ)/2 per cent of the time, on the
average.
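
The counting procedure lends itself to a short program. The sketch
below is a minimal illustration, not the authors' procedure as
printed: the function name is an assumption, and because the original
table of rankings is not legible here, the ranking used for Judge B
is a hypothetical one chosen only so that it reproduces the counts
reported in the text (3 larger and 4 smaller ranks to the right of
the first sentence, 6 larger and none smaller for the second, P = 17,
Q = 11).

```python
from itertools import combinations


def kendall_tau(ranks_a, ranks_b):
    """Return (P, Q, S, tau) for two untied rankings of the same items."""
    n = len(ranks_a)
    p = q = 0
    for i, j in combinations(range(n), 2):
        # A pair counts toward P when both rankings order it the same way,
        # and toward Q when they order it in opposite ways.
        if (ranks_a[i] - ranks_a[j]) * (ranks_b[i] - ranks_b[j]) > 0:
            p += 1
        else:
            q += 1
    s = p - q
    tau = s / (n * (n - 1) / 2)  # formula [1]
    return p, q, s, tau


judge_a = [1, 2, 3, 4, 5, 6, 7, 8]   # natural order
judge_b = [5, 1, 4, 3, 8, 2, 7, 6]   # hypothetical, see lead-in

p, q, s, tau = kendall_tau(judge_a, judge_b)
percent_agreement = (100 + 100 * tau) / 2

print(p, q, s, round(tau, 2), round(percent_agreement, 1))
# prints: 17 11 6 0.21 60.7
```

The last figure is simply P divided by the 28 possible pairs,
expressed as a percentage, which is the interpretation of τ given
above.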

When there are tied ranks, certain adjustments in the computational
formula for τ must be made since the total number of possible item
pairs will vary as a function of the ties. If there are ties in only
one of the rankings, we arrange the untied ranking in the natural
order and proceed to compute S as before except that the numbers in
the second ranking, to the right of the item under consideration,
which are the same as the rank of this item contribute nothing to the
value of S. When both rankings contain ties, we arrange either one in
the natural order and compute conventionally except that item pairs
which are tied in the upper ranking also contribute nothing to the
value of S.²

The major adjustment for tied ranks occurs in the denominator of [1],
as might be expected. The general formula for τ from tied ranks
containing the adjusted denominator is:

$$\tau = \frac{S}{\sqrt{\left[\tfrac{1}{2}n(n-1) - V\right]\left[\tfrac{1}{2}n(n-1) - U\right]}} \qquad [2]$$

where

$$V = \tfrac{1}{2}\sum v(v-1), \qquad U = \tfrac{1}{2}\sum u(u-1).$$

The computation of V and U will be illustrated in the following
example.

² Smith's (12) description of the method for calculating τ when ties
are present is in error since it neglects the effects of ties in the
upper ranking on S. This oversight leads to markedly unreasonable τ's
and distorts the sampling distribution by producing too many large
absolute values of τ. For instance, in one of Smith's examples (12,
p. 570), he obtains a corrected τ of +1.00 despite the fact that one
judge perceived differences between items which were rated
identically by the other. The correct procedure leads to a τ of .808,
which expresses the high, though not perfect, degree of agreement
which is present.

In the two sets of ranks below, the upper ranking has been arranged in the natural order.

Upper ranking:   1    2.5  2.5  4    6    6    6    8
Lower ranking:   2    6    6    6    2    6    2    6


The first item, i.e., the one ranked 1 
in the upper ranking, has five larger 
ranks to its right in the lower rank- 
ing, and none smaller. It is not tied 
with any other item in the upper 
ranking, so its contribution to S is 
+5. The second item has 2 smaller 
ranks to its right, and thus contrib- 
utes —2. Although this item is tied 
in the upper ranking, the pairs which 
are tied are not involved in the con- 
tribution. Similarly for the third 
and fourth items. The fifth item has 
2 larger ranks to its right, but one of 
these, the sixth item, is tied with the 
fifth in the upper ranking, and thus 
does not contribute. The net con- 
tribution of the fifth item is therefore 
only +1 instead of +2. A similar 
procedure for the sixth item leaves it 
with a net contribution of 0. The 
seventh item contributes +1. S,
the net total, is 7 − 6 = 1.

V and U are obtained in the following manner: the upper ranking, from which V is computed, contains two sets of ties, one of extent 2 and one of extent 3. For the first set, v = 2, and v(v − 1) = 2(2 − 1) = 2. For the second, v = 3, and v(v − 1) = 3(3 − 1) = 6. The sum of the expressions v(v − 1) in the upper ranking is (2 + 6) = 8, and V = ½(8) = 4. The lower ranking also contains two sets of ties, of extents 3 and 5. These will enter into the computation of U. For these ties, u = 3 and 5, and u(u − 1) = 6 and 20. Hence U = ½(6 + 20) = 13.

Substituting the computed values of S, V and U in [2], with ½n(n − 1) = ½(8)(7) = 28, we obtain τ = 1/√[(28 − 4)(28 − 13)] = 1/√360 = .05.³
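The tied-rank computation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the two rankings are assumed values chosen to agree with the item-by-item contributions, tie extents, and totals described in the text.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def tie_term(ranks):
    """Half the sum of t(t - 1) over every set of t tied ranks (V or U)."""
    return sum(t * (t - 1) for t in Counter(ranks).values()) / 2


def tau_with_ties(upper, lower):
    """Return (S, V, U, tau) using the tie-adjusted denominator of [2]."""
    n = len(upper)
    # A pair tied in either ranking has a zero sign product and adds nothing to S.
    s = sum(
        sign(upper[j] - upper[i]) * sign(lower[j] - lower[i])
        for i, j in combinations(range(n), 2)
    )
    v, u = tie_term(upper), tie_term(lower)
    pairs = n * (n - 1) / 2
    return s, v, u, s / sqrt((pairs - v) * (pairs - u))


# Rankings assumed for illustration (consistent with the worked contributions).
upper = [1, 2.5, 2.5, 4, 6, 6, 6, 8]
lower = [2, 6, 6, 6, 2, 6, 2, 6]
S, V, U, tau = tau_with_ties(upper, lower)
print(S, V, U, round(tau, 2))
```

Scoring each pair by the product of two signs handles both tie rules at once, since a pair tied in either ranking yields a zero product.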

When τ is computed without adjusting the denominator for ties, it will always be numerically less than when the adjustments are made. The use of the uncorrected denominator is recommended by Kendall (7) when agreement with an objective ranking is being determined. In such a case, only the judge's ranking would contain ties, and these would theoretically indicate inability to discriminate the objective order, a failing for which the judge should properly be penalized. In general, however, the corrected formula should be used since rank correlations are usually computed when agreement rather than accuracy is the issue.

The procedure for adjusting for tied ranks can be generalized to include cases involving dichotomies. A dichotomy may be regarded as 2 sets of tied ranks of the extents of the number in each of the two categories, and the computational procedure need not differ from instances in which ties are less extensive. However, some labor can be avoided by the use of the following formulae:

\[ \tau = \frac{S}{\sqrt{\left[\tfrac{1}{2}n(n-1) - V\right]pq}} \qquad [3] \]

when one of the variates is a dichotomy consisting of p and (n − p) = q members in the categories, or

\[ \tau = \frac{S}{\sqrt{(pq)(xy)}} \qquad [4] \]

when both variates are dichotomized into categories consisting of p and q members, and x and (n − x) = y members. In this case, if we arrange the frequencies in a 2×2 table as for correlated proportions, S will be found to equal the difference between the products of the frequencies in the diagonal cells.

³ If the computation of τ when ties are present seems tedious, it should be pointed out that rho has no advantage in this respect. The proper computation of rho from tied ranks also involves corrections in both numerator and denominator, the latter being similar in form and effort to that required for τ. Unfortunately, most texts fail to mention, let alone describe, the corrections for rho, thereby creating the inaccurate belief that none are necessary.
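The shortcut for two dichotomies can be checked against a direct count of item pairs. The sketch below assumes a 2×2 table laid out as [[a, b], [c, d]], with rows for the first dichotomy and columns for the second; the function names and cell frequencies are hypothetical and serve only as an illustration.

```python
from itertools import combinations
from math import sqrt


def tau_from_2x2(a, b, c, d):
    """S and tau for a 2x2 table [[a, b], [c, d]] via the two-dichotomy formula."""
    s = a * d - b * c            # difference of the diagonal products
    p, q = a + b, c + d          # row totals (first dichotomy)
    x, y = a + c, b + d          # column totals (second dichotomy)
    return s, s / sqrt(p * q * x * y)


def s_by_pair_counting(a, b, c, d):
    """Brute-force S: score every pair of items as +1, -1, or 0."""
    items = [(0, 0)] * a + [(0, 1)] * b + [(1, 0)] * c + [(1, 1)] * d
    sign = lambda z: (z > 0) - (z < 0)
    return sum(sign(j[0] - i[0]) * sign(j[1] - i[1])
               for i, j in combinations(items, 2))


# Hypothetical frequencies, chosen only for illustration.
S, tau = tau_from_2x2(5, 2, 1, 4)
print(S, round(tau, 3), s_by_pair_counting(5, 2, 1, 4))
```

Only pairs drawn from opposite corners of the table differ on both variates, which is why the pair count collapses to the difference of the diagonal products.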


TESTS OF SIGNIFICANCE⁴

The distribution of sample τ's for uncorrelated variables rapidly approaches normality and is satisfactorily approximated, when n > 10, by the normal distribution with a mean of zero and a variance defined as

\[ \sigma_\tau^2 = \frac{4n + 10}{9n(n-1)} \qquad [5] \]

When ties are present, the formula for the variance of τ becomes complicated.⁵ If the number of ties is small, [5] may be used with only a slight error. Since the correction for ties will invariably reduce the variance, the use of the uncorrected formula will furnish a more conservative test of the null hypothesis.

⁴ In this paper all significance tests are attributable to Kendall (7) unless there is a specific indication to the contrary.

⁵ The variance of τ when both rankings contain ties is

\[ \sigma_\tau^2 = \frac{4}{n^2(n-1)^2}\left\{ \frac{1}{18}\left[ n(n-1)(2n+5) - \sum v(v-1)(2v+5) - \sum u(u-1)(2u+5) \right] + \frac{\left[\sum v(v-1)(v-2)\right]\left[\sum u(u-1)(u-2)\right]}{9n(n-1)(n-2)} + \frac{\left[\sum v(v-1)\right]\left[\sum u(u-1)\right]}{2n(n-1)} \right\}. \]

If only one ranking contains ties, this reduces to

\[ \sigma_\tau^2 = \frac{2}{9n^2(n-1)^2}\left[ n(n-1)(2n+5) - \sum u(u-1)(2u+5) \right]. \]

Kendall (7) provides probability tables for evaluating the significance of an obtained S (rather than its τ) when n ≤ 10. Values of τ required for significance at the .10, .05, and .01 levels (or beyond, since τ can take only a limited number of values) for n's from 4 through 10 are shown in Table 1.
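For untied rankings the exact null distribution of S can be enumerated directly, since every ordering of the second ranking is equally likely and S = ½n(n − 1) − 2Q. The following sketch assumes two-tailed probabilities, an assumption that is consistent with the p value of .0166 quoted in the worked example later in the paper; the function names are hypothetical.

```python
from math import factorial


def inversion_counts(n):
    """counts[k] = number of orderings of n untied items with Q = k."""
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            for shift in range(m):      # the m-th item can add 0..m-1 inversions
                new[k + shift] += c
        counts = new
    return counts


def two_tailed_p(n, s):
    """Exact probability of an |S| at least as large as the one observed."""
    max_s = n * (n - 1) // 2
    k = (max_s - abs(s)) // 2           # Q in the positive tail, since S = max_s - 2Q
    tail = sum(inversion_counts(n)[: k + 1])
    return min(1.0, 2 * tail / factorial(n))


def critical_tau(n, alpha):
    """Smallest attainable |tau| with two-tailed p <= alpha, or None."""
    max_s = n * (n - 1) // 2
    for s in range(max_s % 2, max_s + 1, 2):   # S shares the parity of max_s
        if two_tailed_p(n, s) <= alpha:
            return s / max_s
    return None


for n in range(4, 11):
    print(n, [critical_tau(n, a) for a in (0.10, 0.05, 0.01)])
```

A result of None marks a level that no arrangement of n items can reach, which is why the smallest n's have no entry at the stricter levels.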

When ties are present in one of the rankings, Sillitto's tables (11) of the distribution of S for all possible numbers of pair and triplet ties for small n's may be used. When other types of ties are present, or when both rankings contain ties, the evaluation of τ is not feasible if n is 10 or less.

When one ranking is a dichotomy consisting of x and y members so that (x + y) = n, the variance is

\[ \sigma_\tau^2 = \frac{4xy}{3n^3(n-1)^3}\left\{ n^3 - n - \sum_u (u^3 - u) \right\}. \]

The variance when one ranking is a dichotomy and the other contains no ties is

\[ \sigma_\tau^2 = \frac{4xy(n+1)}{3n^2(n-1)^2}. \]

When both rankings are dichotomies with x and y, and p and q members respectively, the variance becomes

\[ \sigma_\tau^2 = \frac{4xypq}{n^2(n-1)^3}. \]

The above formulae are to be found in Kendall (7).


CORRECTION FOR CONTINUITY

When the significance of τ is evaluated using normal probability tables, it must be corrected for continuity, since S cannot assume all values within the range ±½n(n − 1). Since n is fixed, an increase in P is accompanied by a decrease in Q, and the minimum change in S is thus 2. The appropriate correction for continuity is therefore to subtract 1 from the absolute value of S. This is equivalent to a deduction of 2/n(n − 1) from the absolute value of τ, and the correction may be applied at either point.

This simple correction is appropriate when neither distribution contains ties, or when only one has ties. When one ranking consists entirely of ties of extent u, and the other ranking is a dichotomy, the correction for continuity consists of subtracting u from S, or 2u/n(n − 1) from τ. If both variates are dichotomies, the deduction for continuity from S is ½n, or 1/(n − 1) from τ.

In instances where both rankings contain ties but are not dichotomies, there is no simple way of applying a correction. Whitfield's proposed correction (13) for the case in which one variate is a dichotomy and the other contains ties of varying extents might be used for the general case of ties in both rankings. Whitfield's method involves arranging the undichotomized ranking in the natural order and subtracting the extent of the ties involving the smallest and the greatest rank from twice the number of items ranked. This quantity is then divided by the number of intervals in the ranking. One-half of this quotient is the deduction from S for the correction. If τ is corrected instead of S, the deduction is the quotient divided by n(n − 1). The formal expression for this correction for S is

\[ \frac{2n - v_1 - v_2}{2n_i} \qquad [6] \]

where n is the number of items ranked, v₁ is the extent of the tie involving the smallest rank, v₂ is the extent of the tie involving the largest rank, and n_i is the number of intervals in the ranking. (If a ranking had no ties, n_i = (n − 1); in a dichotomy, n_i = 1.) In our illustrative problem (p. 340), n = 8, v₁ = 1, v₂ = 1, and n_i = 4. Accordingly, the deduction from S would be

\[ \frac{2 \times 8 - 1 - 1}{2 \times 4} = 1.75. \]
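Whitfield's deduction is easy to compute from a ranking once the tie extents at the two extremes and the number of intervals are counted. The sketch below is illustrative only: the function name is hypothetical, and the tied ranking is an assumed one with n = 8, untied extreme ranks, and four intervals, as in the illustrative problem.

```python
from collections import Counter


def whitfield_deduction(ranking):
    """Deduction from |S| given by [6] for a ranking that may contain ties."""
    n = len(ranking)
    extents = Counter(ranking)
    v1 = extents[min(ranking)]      # extent of the tie at the smallest rank
    v2 = extents[max(ranking)]      # extent of the tie at the largest rank
    n_i = len(extents) - 1          # intervals between distinct rank values
    return (2 * n - v1 - v2) / (2 * n_i)


# Ranking assumed for the illustrative problem (n = 8, four intervals).
print(whitfield_deduction([1, 2.5, 2.5, 4, 6, 6, 6, 8]))   # 1.75
print(whitfield_deduction(list(range(1, 11))))             # 1.0, the simple correction
```

With no ties the expression reduces to the simple deduction of 1, and for a dichotomy it reduces to ½n. A ranking in which every item is tied has no intervals and is not handled by this sketch.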


The generalization of Whitfield’s 
procedure to the general case of ties 
in both rankings is apparently not a 
simple matter, and it has not yet been 
accomplished. A suggestion would be 
to consider the ranking with the fewer 
intervals (and the most tied items) 
as a dichotomy, and to apply Whit- 
field’s correction. This actually will 
provide an overcorrection for con- 
tinuity and hence a safer test of the 
null. 


CONFIDENCE LIMITS OF T 


It is often desirable to establish confidence limits for the parameter correlation when a significant sample coefficient has been obtained. For any value of a population T, the sampling distribution of τ tends rapidly toward normality (though not so rapidly as in the null case), provided that the absolute value of T is not too close to unity. The mean of the distribution is the population T, but the variance cannot be exactly determined unless something is known about the arrangement of ranks in the population, information which is almost always lacking. However, it can be shown that for any parameter T, the variance of τ cannot exceed the value

\[ \text{maximum } \sigma_\tau^2 = \frac{2(1 - T^2)}{n} \qquad [7] \]

Confidence limits of T can be set by substituting the value of the sample τ in [7]. An alternate method is to solve equation [8] with the roots providing the limits. The value of x is the normal deviate corresponding to the desired probability level.

\[ T = \frac{\tau \pm x\sqrt{\dfrac{2}{n}}\,\sqrt{1 - \tau^2 + \dfrac{2x^2}{n}}}{1 + \dfrac{2x^2}{n}} \qquad [8] \]

Since the limits determined by means of [7] and [8] are based on a maximum variance, the probability is at least, but not precisely, (1 − P) that the true T lies within those limits. Unless n is fairly large, the magnitude of the limits will often be so great as to render them practically useless. Kendall (7) has developed an additional method which involves the estimation of a parameter representing the arrangement of ranks in the population from the obtained data. While this method frequently results in tremendous reductions in the extent of the confidence limits, it is too complicated and laborious for ordinary use.



SIGNIFICANCE OF A DIFFERENCE
BETWEEN τ's

Evaluating the significance of a difference between two independent τ's presents no special problems since such differences will be approximately normally distributed around a mean of zero in a test of the null hypothesis. The critical ratio which is conventionally used in such situations is applicable to τ. The standard error of the difference is, as usual, \( \sqrt{\sigma_{\tau_1}^2 + \sigma_{\tau_2}^2} \), where σ_τ² is computed by [7].

If we wish to avoid using the sample τ as an estimate of T in computing the variances, we have recourse to a transformation called w, which is defined as sin⁻¹ τ, in radians. Kendall (7) has shown that the sampling variance of w can be maximized at 2/n, a value independent of the parameter w. The standard error of the difference between w₁ and w₂ can be maximized at

\[ \sqrt{\frac{2}{n_1} + \frac{2}{n_2}}, \]

an expression which does not require an estimation of population w's from the data.
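A minimal sketch of the w-transformation test follows. It assumes two independent samples and takes the maximum standard error of the difference to be the square root of the sum of the two maximum variances 2/n; the function name and the input values are hypothetical.

```python
from math import asin, erfc, sqrt


def difference_test(tau1, n1, tau2, n2):
    """Critical ratio and two-tailed p for tau1 - tau2 on the w = arcsin(tau) scale."""
    w1, w2 = asin(tau1), asin(tau2)
    se_max = sqrt(2 / n1 + 2 / n2)      # sum of the two maximum variances 2/n
    z = (w1 - w2) / se_max
    p = erfc(abs(z) / sqrt(2))          # two-tailed normal probability
    return z, p


# Hypothetical coefficients and sample sizes, for illustration only.
z, p = difference_test(0.60, 40, 0.20, 50)
print(round(z, 2), round(p, 3))
```

Because the standard error is a maximum rather than an estimate, the resulting critical ratio errs on the conservative side.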

The w transformation may also be 
used to set confidence limits for a T, 
though there is no reason to feel that 
this would be a desirable practice. 
Limits set in this manner, while dif- 
fering slightly from those determined 
by [8], cannot be said to be more ac- 
curate, since it is not known whether 
the distribution of w is nearer normality
than that of τ. Furthermore,
the computations involved in converting
from τ to w and back again may
very well exceed those required in 
solving [8] to obtain the limits. 


A COMPLETE COMPUTATIONAL 
EXAMPLE 
Consider the following set of rankings where the first has been arranged in the natural order:

First ranking:    1   2   3   4   5   6   7   8   9   10
Second ranking:   6   8   10  9   7   5   2   4   1   3

Computing S, we obtain +4, −5; +2, −6; 0, −7; 0, −6; 0, −5; 0, −4; +2, −1; 0, −2; and +1, 0. The total for P is 9, the total for Q is 36, and S = −27. For the denominator, ½n(n − 1) = ½(10)(9) = 45. According to [1], τ = −27/45 = −.60. Entering Table 1 with an n of 10, we find that a τ of .60 is significant beyond the .05 level. The precise p value is .0166.
If we wish to use the normal approximation, we require the standard error of τ, and we must correct τ for continuity. From [5], we compute the variance of τ as .0617, and the standard error, .248. Applying the continuity correction at S, we recompute τ from [1] thus: (−27 + 1)/45 = −.578. Or, correcting τ itself: −.60 + 2/(10)(9) = −.578. (Since the correction is subtracted from the absolute value of S or τ, we add it to a negative statistic.) The critical ratio of τ is thus −.578/.248 = −2.33, which corresponds to a probability of .0198. Comparing this value with the probability obtained from Table 1, we see that the normal approximation is slightly in error when n is as small as 10, though it provides a somewhat more stringent null test.
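The arithmetic of this example can be retraced with a short script. The second ranking used here is an assumption, chosen because it yields exactly the item-by-item contributions listed in the text (P = 9, Q = 36); the variable names are illustrative.

```python
from itertools import combinations
from math import erfc, sqrt

second = [6, 8, 10, 9, 7, 5, 2, 4, 1, 3]   # assumed second ranking; first is 1..10
n = len(second)

# With the first ranking in natural order, each pair (i < j) adds to P when the
# second ranking also increases and to Q when it decreases.
P = sum(second[j] > second[i] for i, j in combinations(range(n), 2))
Q = sum(second[j] < second[i] for i, j in combinations(range(n), 2))
S = P - Q
pairs = n * (n - 1) / 2
tau = S / pairs                                   # formula [1]

variance = (4 * n + 10) / (9 * n * (n - 1))       # formula [5]
se = sqrt(variance)

# Continuity correction: move S one unit toward zero before forming the ratio.
S_corrected = S - 1 if S > 0 else S + 1 if S < 0 else 0
z = (S_corrected / pairs) / se
p = erfc(abs(z) / sqrt(2))                        # two-tailed normal probability

print(P, Q, S, tau, round(variance, 4), round(se, 3), round(z, 2), round(p, 3))
```

The probability computed from the unrounded critical ratio differs slightly in the fourth decimal place from the value read from normal tables at 2.33.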

To set the confidence limits of T at the .05 level and beyond, we solve [8] with τ = −.60 and x = 1.96. The roots of the quadratic are −.93 and +.25, which are the limits of T. The finding is hardly illuminating, though not unexpected. Any correlation based on only 10 instances is bound to be an uncertain estimate of the population value. If we had used [7] to compute the limits, we would obtain a maximum standard error of .358 and limits of −.60 ± .70 at the .05 level or beyond.
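Both sets of limits can be reproduced numerically. In this sketch, [8] is taken to be the pair of roots of the quadratic obtained by setting (τ − T)² equal to x² times the maximum variance in [7]; that reading is an assumption, although it reproduces the limits quoted above. The function names are hypothetical.

```python
from math import sqrt


def limits_by_eq7(tau, n, x):
    """Symmetric limits tau +/- x * (maximum standard error from [7])."""
    se_max = sqrt(2 * (1 - tau ** 2) / n)
    return tau - x * se_max, tau + x * se_max, se_max


def limits_by_eq8(tau, n, x):
    """Roots of (tau - T)^2 = x^2 * 2(1 - T^2)/n, solved for T."""
    c = 2 * x ** 2 / n
    half_width = sqrt(c * (1 + c - tau ** 2))
    return (tau - half_width) / (1 + c), (tau + half_width) / (1 + c)


lo7, hi7, se_max = limits_by_eq7(-0.60, 10, 1.96)
lo8, hi8 = limits_by_eq8(-0.60, 10, 1.96)
print(round(se_max, 3), round(lo7, 2), round(hi7, 2))   # 0.358 -1.3 0.1
print(round(lo8, 2), round(hi8, 2))                     # -0.93 0.25
```

The symmetric limits from [7] run past −1, whereas the roots of the quadratic always stay within the admissible range of T.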


PARTIAL RANK CORRELATION 


A procedure for computing a partial τ when there are more than two rankings is described by Kendall (7). Suppose that we wish to determine the relationship between the rankings of Judges A and B with the ranking of Judge C held constant. Arrange the ranking of Judge C in the natural order, with those of Judges A and B beneath. There are n(n − 1)/2 couplets in each ranking, i.e., items 1 and 2, 1 and 3, . . . 1 and n, 2 and 3, etc. In Judge C's ranking, the order of magnitude of each couplet is the same; the one to the right is the larger. We determine (a) the number of couplets on which both Judge A and Judge B agreed with Judge C as to order, (b) the number of couplets on which both disagreed with Judge C, (c) the number on which A agreed and B disagreed, and (d) the number on which B agreed and A disagreed.

These frequencies are now arranged in an ordinary 2×2 contingency table and the partial τ of the rankings of Judges A and B independent of that of Judge C is defined as

\[ \tau_{AB \cdot C} = \frac{ab - cd}{\sqrt{(a+c)(a+d)(b+c)(b+d)}} \qquad [9] \]

It so happens that

\[ \tau_{AB \cdot C} = \frac{\tau_{AB} - \tau_{AC}\,\tau_{BC}}{\sqrt{1 - \tau_{AC}^2}\,\sqrt{1 - \tau_{BC}^2}}, \qquad [10] \]

an expression which is analogous to that for the product-moment partial correlation coefficient. It happens further that τ_AB·C = √(χ²/n), which illustrates the relationship between partial τ and the phi coefficient. Examples of the computation of partial τ using [9] can be found in Kendall (7) and Smith (12). The latter's example, though correct in form, contains arithmetic errors so that the computed partial is inaccurate. For most purposes [10] will be the more useful computational method.


TABLE 1

VALUES OF τ REQUIRED FOR SIGNIFICANCE AT THE .10, .05 AND .01 LEVELS AND BEYOND*

Level    n = 4    5      6      7      8      9      10
.10      1.00     0.80   0.73   0.62   0.57   0.50   0.47
.05      -        1.00   0.87   0.71   0.64   0.56   0.51
.01      -        -      1.00   0.90   0.79   0.72   0.64

* Based on Kendall's (7) tables.

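The agreement of the two computational routes for partial τ, when no ties are present, can be illustrated as follows. The three rankings are hypothetical and the function names are illustrative; the counts a, b, c, and d follow the definitions given for the couplets above.

```python
from itertools import combinations
from math import sqrt


def sign(z):
    return (z > 0) - (z < 0)


def tau(x, y):
    """Kendall's tau for two untied rankings, formula [1]."""
    n = len(x)
    s = sum(sign(x[j] - x[i]) * sign(y[j] - y[i])
            for i, j in combinations(range(n), 2))
    return s / (n * (n - 1) / 2)


def partial_tau_counts(A, B, C):
    """Partial tau of A and B with C held constant, from the 2x2 counts of [9]."""
    a = b = c = d = 0
    for i, j in combinations(range(len(C)), 2):
        order_c = sign(C[j] - C[i])
        a_agrees = sign(A[j] - A[i]) == order_c
        b_agrees = sign(B[j] - B[i]) == order_c
        if a_agrees and b_agrees:
            a += 1
        elif not a_agrees and not b_agrees:
            b += 1
        elif a_agrees:
            c += 1
        else:
            d += 1
    return (a * b - c * d) / sqrt((a + c) * (a + d) * (b + c) * (b + d))


def partial_tau_formula(A, B, C):
    """The same partial tau from the three pairwise taus, formula [10]."""
    t_ab, t_ac, t_bc = tau(A, B), tau(A, C), tau(B, C)
    return (t_ab - t_ac * t_bc) / (sqrt(1 - t_ac ** 2) * sqrt(1 - t_bc ** 2))


# Hypothetical untied rankings by three judges.
C = [1, 2, 3, 4, 5, 6]
A = [2, 1, 4, 3, 6, 5]
B = [1, 3, 2, 5, 4, 6]
print(round(partial_tau_counts(A, B, C), 4), round(partial_tau_formula(A, B, C), 4))
```

Neither function guards against a zero denominator, which arises when one judge agrees (or disagrees) with Judge C on every couplet.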

The use of partial τ when ties are present is questionable since [9] and [10] will give different results in such instances. This drawback, added to the fact that generally applicable tests of the significance of any partial τ are not yet available, limits the value of the statistic.⁶

An expression for a multiple τ has been developed by Moran (10), but the problems of the sampling distribution of multiple τ, although apparently simpler than those of partial τ, have also not yet been solved. The usefulness of multiple τ, like that of partial τ, is limited at the present time.



⁶ Hoeffding (5) shows that when neither T_AC nor T_BC is unity, the distribution of √n(τ_AB·C − T_AB·C) is approximately normal for large n's with a mean of zero and a variance given by an expression which he derives. Furthermore, when T_AC and T_BC are zero, the distribution of √n(τ_AB·C − T_AB·C) is the same, in the limit, as that of √n(τ_AB − T_AB).


THE RELATIONSHIP
BETWEEN τ AND r


When ranked data can be assumed to be based on continuous, normal distributions and n is fairly large, an estimate of the parameter product-moment coefficient can be obtained by means of a transformation of τ. The formula for this transformation is

\[ r = \sin\left(\frac{\pi}{2}\tau\right) \text{ (radians)} = \sin(90^\circ\,\tau) \text{ (degrees)}. \qquad [11] \]

The significance of the estimated r can be tested by simply testing the τ from which it was derived, using normal tables and a variance computed by [5].

In the nonnull case, the distribution of sample τ's will be approximately normal for large n's, with a mean of T and a maximum variance of

\[ \text{maximum } \sigma_\tau^2 = \frac{5(1 - T^2)}{9(n-1)} \qquad [12] \]

Confidence limits for T can be obtained using this variance, and corresponding limits for the transformed r are computed by translating the limits of T into those for r using [11].⁷

A comparison of the upper limit of 
the variance of 7 by [12] when nor- 
mality is assumed with its upper 
limit by [7] when no assumptions are 
made will show that the assumption 
of normality decreases the standard 
error of τ by approximately 50 per
cent in the nonnull case. On the other
hand, if τ is used to estimate r when
the latter could be computed directly 
from the data, there will be a con- 
siderable loss of sensitivity since the 
standard error of the former is always 
greater than that of the latter. The 
ratio of the standard errors will vary 
from 1.2 when the variates are uncor- 
related up to approximately 1.9 when 
the true r is .90.
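The conversion from τ to r and the quoted range of the ratio of standard errors can be examined with a small calculation. The sketch rests on several assumptions that should be read as such: the maximum variance of τ under normality is taken to be proportional to (5/9)(1 − T²), the standard error of a directly computed r is taken as the usual large-sample value proportional to (1 − r²), and the same sample-size factor is assumed in both so that it cancels. The function names are hypothetical.

```python
from math import asin, pi, sin, sqrt


def r_from_tau(tau):
    """Estimate of the product-moment r from tau: r = sin(pi * tau / 2)."""
    return sin(pi * tau / 2)


def tau_from_r(r):
    """Inverse of the conversion above."""
    return 2 * asin(r) / pi


def se_ratio(r):
    """(Max. standard error of r estimated from tau) / (standard error of r)."""
    tau = tau_from_r(r)
    # Assumed maximum variance of tau: (5/9)(1 - T^2) / m; the factor m cancels.
    se_tau_max = sqrt(5 * (1 - tau ** 2) / 9)
    # Delta method: dr/dtau = (pi/2) cos(pi tau / 2) = (pi/2) sqrt(1 - r^2).
    se_r_from_tau = (pi / 2) * sqrt(1 - r ** 2) * se_tau_max
    se_r_direct = 1 - r ** 2          # usual large-sample value, times 1/sqrt(m)
    return se_r_from_tau / se_r_direct


print(round(r_from_tau(0.5), 3))                    # 0.707
print(round(se_ratio(0.0), 2), round(se_ratio(0.9), 2))
```

Under these assumptions the ratio rounds to 1.2 for uncorrelated variates and to 1.9 for a true r of .90, in line with the figures given in the text.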

The conversion formula for r from τ is justified only by the assumption of normality of distribution of the variates, and when n is fairly large. Otherwise, it would seem advisable to avoid estimating r from ranked data, and to limit the conclusions to statements concerning τ.


⁷ A standard error for r computed from τ can be derived using the conversion formula (7). Its upper limit is

\[ \sigma_r = \sqrt{\frac{1.37(1 - r^2)(1 - \tau^2)}{n - 1}}. \]

The procedure for setting limits for r by converting limiting τ's into limiting r's is, however, preferable because of the greater symmetry of the distribution of τ.
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